Data Engineer with experience building robust web applications, database models, and ETL pipelines. A constant learner who aims to help develop innovative solutions that positively impact the industry. I am good at...
I use Python's Pandas, Matplotlib, NumPy, Scikit-learn, and Orange for data analysis. I can also use Google Cloud Platform services such as AutoML, BigQuery, and Looker Studio.
I know RDBMS and NoSQL for database management, namely PostgreSQL, MySQL, SQL Server, SQLite, Oracle, and MongoDB. I also have experience querying and managing a Hive data warehouse.
As a data engineer, I have experience with three programming languages: Python, Java, and Scala. The Google Cloud services I’ve used for large-scale data processing and management are Cloud Functions, Pub/Sub, Dataproc, and Cloud Composer.
The project's objective is to predict an animal's class given some of its characteristics such as hair, feathers, backbone, fins, etc. The machine learning algorithms used to create the classifier are Logistic Regression and Decision Tree.
Python imports and dataset
Preparing the data and training the model
Final dataset (with prediction) and data visualization
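A minimal sketch of the classification step described above, assuming scikit-learn. The tiny dataset, the four feature columns, and the class names are hypothetical stand-ins for illustration, not the project's actual data or code.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: each row is [hair, feathers, backbone, fins] (1 = yes, 0 = no).
X = [
    [1, 0, 1, 0], [1, 0, 1, 0],  # mammals
    [0, 1, 1, 0], [0, 1, 1, 0],  # birds
    [0, 0, 1, 1], [0, 0, 1, 1],  # fish
    [0, 0, 0, 0], [0, 0, 0, 0],  # invertebrates
]
y = ["mammal", "mammal", "bird", "bird",
     "fish", "fish", "invertebrate", "invertebrate"]

# Fit the two model families named in the project description.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
logreg = LogisticRegression(max_iter=1000).fit(X, y)


def predict_class(model, hair, feathers, backbone, fins):
    """Return the predicted class label for a single animal."""
    return model.predict([[hair, feathers, backbone, fins]])[0]


# An animal with feathers and a backbone, but no hair or fins.
print(predict_class(tree, 0, 1, 1, 0))
print(predict_class(logreg, 0, 1, 1, 0))
```

A real run would hold out part of the data for evaluation instead of predicting on rows the models were trained on.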
Class Project for Web Applications Development 1
Calm Flight is a flight reservation system for domestic flights within the Philippines. Both customers and employees can log in to the system. The customer login records all flight transactions, while the admin login allows flight destinations to be set and hotel reservation transactions to be managed.
Home Page
Flight Search Results
Registration
Login
This project contains an end-to-end data pipeline for processing the Johns Hopkins University COVID-19 dataset. The pipeline backfills data for the date range of January 2020 to March 2023. It consists of three main components: a data loader, a transformer, and a data exporter.
Data Pipeline Architecture
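The three components and the backfill can be sketched roughly as follows. This is an illustrative outline using only the Python standard library. The column names, the sample CSV text, and the in-memory list standing in for the export target are all assumptions, not the project's actual schema or tooling.

```python
import csv
import io
from datetime import date, timedelta


def backfill_dates(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def load(raw_csv):
    """Data loader: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows, report_date):
    """Transformer: cast counts to int and tag each row with its report date."""
    return [
        {
            "report_date": report_date.isoformat(),
            "country": row["country"],
            "confirmed": int(row["confirmed"] or 0),  # treat blanks as 0
        }
        for row in rows
    ]


def export(rows, sink):
    """Data exporter: append rows to the sink (stand-in for a warehouse table)."""
    sink.extend(rows)
    return len(rows)


# Hypothetical sample standing in for one daily report file.
SAMPLE = "country,confirmed\nPhilippines,3\nJapan,\n"

warehouse = []
for day in backfill_dates(date(2020, 1, 1), date(2020, 1, 3)):
    export(transform(load(SAMPLE), day), warehouse)

print(len(warehouse), "rows exported")
```

Keeping each stage as a plain function makes it easy to rerun a single day when a backfill partially fails.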
Class Project for Information and Software Assurance and Security
The United Nations found that around 265 million children are out of school and that approximately 22% of them are of primary school age. Everyone, regardless of age, race, or gender, has the right to education. To ensure that everyone has access to education, the United Nations established quality education as one of its Sustainable Development Goals, a blueprint for achieving a sustainable future for all. This goal aims to provide inclusive, quality education for all and to promote lifelong learning.
DigiWiz, an open-source learning platform, was created to help support the United Nations’ goal for quality education. To evaluate the effectiveness of the system, 10 primary school students and 4 college students were asked to test it. Based on the respondents’ assessment, the majority agreed that the system met the criteria for functionality, performance, reliability, supportability/security, and usability.
The project documentation is provided below.
Home Page
Courses Page
Course Details Page
Admin Dashboard Page
Project for Internship 2
The project extracts data from various database services, transforms it into a specific format, and loads it into SQL Server.
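A minimal extract-transform-load sketch of that flow. To stay self-contained it uses in-memory SQLite databases as stand-ins for both the source service and SQL Server. The table names, columns, and date format are hypothetical, chosen only to illustrate the pattern.

```python
import sqlite3


def extract_rows(source):
    """Extract: read raw customer rows from the source database."""
    return source.execute("SELECT id, full_name, signup FROM customers").fetchall()


def to_target_format(rows):
    """Transform: tidy names and convert DD/MM/YYYY dates to ISO 8601."""
    cleaned = []
    for row_id, name, signup in rows:
        day, month, year = signup.split("/")
        cleaned.append((row_id, name.strip().title(), f"{year}-{month}-{day}"))
    return cleaned


def load_rows(target, rows):
    """Load: upsert the cleaned rows into the target table."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS dim_customer "
        "(id INTEGER PRIMARY KEY, full_name TEXT, signup_date TEXT)"
    )
    target.executemany("INSERT OR REPLACE INTO dim_customer VALUES (?, ?, ?)", rows)
    target.commit()


# Hypothetical source data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, full_name TEXT, signup TEXT)")
source.execute("INSERT INTO customers VALUES (1, '  ana cruz ', '05/01/2023')")

target = sqlite3.connect(":memory:")  # stand-in for SQL Server
load_rows(target, to_target_format(extract_rows(source)))

loaded = target.execute("SELECT * FROM dim_customer").fetchall()
print(loaded)
```

Against a real SQL Server instance, the connection objects would come from a database driver instead of `sqlite3`, and the upsert statement would need that server's syntax.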
Capstone Project in partial fulfillment of the requirements for the Degree of Bachelor of Science in Information Technology with specialization in Service Management and Business Analytics.
The project is an e-commerce application that helps TECHNOHOLICS monitor transactions and forecast sales so that the business can develop better marketing strategies based on customers’ purchase data. Using the Apriori algorithm, the system generates the frequent itemsets for each customer and derives association rules from them to recommend what the customer might purchase next.
Home Page
Product Details Page
Shopping Cart Page
Admin Dashboard Page
Sales Forecast
Feedback Page
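The Apriori step described in the project summary can be illustrated with a small pure-Python sketch. The transactions, item names, and thresholds below are hypothetical examples, and this is not the application's actual implementation.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]
    # Level 1 candidates: every individual item.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) tuples above the threshold."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical purchase history.
transactions = [
    ["laptop", "mouse"],
    ["laptop", "mouse", "bag"],
    ["laptop", "bag"],
    ["mouse"],
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(itemsets, min_confidence=0.9)
print(rules)
```

In this example the only rule that clears the confidence threshold is that a customer who buys a bag also buys a laptop, which is the kind of suggestion a recommendation feature could surface.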
Class Project for Web Applications Development 2
Login Page
Payroll Data
Employee Data
About Page
The project's objective is to answer general queries from the user. The chatbot uses deep neural networks in TensorFlow to learn from and interpret the text that the user enters.
Project for Internship 1
The aim of the social media analytics project is to analyze the feedback that people give to the clients via social media and personal blogs.
Meanwhile, the data annotation project prepares data for the analysis to be performed by the company's data analysts.
Class Project for Database Management 2
Student Record System is a management information system that educational institutions use to manage student data. Professors use it to record data for all of their students. Its main benefit is the ability to report student grades. A second benefit, particularly with automated systems, is the efficiency in processing and exchanging student records among schools. When student records are added to an overall management information system that includes information on staff, materials, and budgeting for the school or school district, more management activities can be accomplished and efficiency improves. Student record systems thus play a key role in the overall functioning of the education system; more importantly, they increase a school's ability to meet the needs of students.
This project is a Java application that connects to an Oracle database.
Entity Relationship Diagram
Class Project for Analytics Application
Social media has become an integral part of life, and the rise of different social networking platforms is inevitable. One example is Twitter, a free social networking site that allows people to share their thoughts and opinions using tweets. With millions of people tweeting every day, it is clear that behind those tweets are emotions that the users express.
The software Orange is used to classify tweets based on Ekman’s theory of six universal emotions and to identify whether the tweets are positive, negative, or neutral. A total of 1,000 tweets were gathered using the Twitter API via its content query search. Before the fetched data was analyzed, the tweets were preprocessed using the Preprocess Text widget to remove unnecessary words and punctuation marks commonly found in a word cloud created beforehand. Sentiment analysis was used to determine and predict the emotions behind each user's tweets. The VADER method was used in the Sentiment Analysis widget, and Ekman’s emotion classifier was used in the Tweet Profiler widget. To visualize the results of the analysis, a box plot, a distribution chart, and a heat map were used.
For the complete project overview, the documentation is provided below.
Orange is an open-source data visualization, machine learning and data mining toolkit.
Project's Overall Workspace
Preprocessed data visualized using the Word Cloud widget
The results gathered from the analysis were visualized in a Box Plot
I worked with Chris when he was an intern at Acudeen. Chris was knowledgeable in Python, and he used it in crafting and provisioning the ETL project at Acudeen. He is open to learning new technologies and libraries within Python and is curious about JavaScript as well. He is easy to work with and gets along with colleagues very well.
I've had the opportunity to be on the same project with him numerous times, during which I was able to see his different skills and his good attitude toward the group. His keen data analysis and superb database management are just two examples of why I believe Chris would be a perfect fit as a Data Engineer.
He has vast knowledge of different programming languages and the motivation to learn more languages relevant to modern industry. I was a member of his group in one of our projects, and he showed not just great programming skills but also leadership skills.