Innovation Hub | Graduate - Data Engineer, Cairo, Egypt
Deloitte
- Cairo
- Permanent
- Full-time
- Data Ingestion and Extraction: Collect and extract data from various sources, such as databases, APIs, logs, and flat files. This involves setting up data pipelines or ETL (Extract, Transform, Load) processes to gather and prepare the data for analysis.
- Data Transformation: Clean, pre-process, and transform raw data into structured formats suitable for analysis. This includes handling missing data, data normalization, and data quality checks.
- Database Management: Design, develop, and maintain databases and data warehouses. Optimize database performance, manage schema changes, and ensure data security.
- Data Modelling: Create and maintain data models to support business reporting and analytics. This might involve designing star or snowflake schemas for data warehouses or defining data structures for NoSQL databases.
- ETL Development: Develop and maintain ETL jobs and workflows using tools like Apache Spark, Apache NiFi, or cloud-based ETL services. Ensure data pipelines are reliable and scalable.
- Data Quality Assurance: Implement data quality checks and validation procedures to identify and rectify data anomalies or discrepancies. Monitor data pipelines for errors and perform data reconciliation.
- Reporting and Visualization: Collaborate with data analysts and data scientists to create dashboards and reports that provide insights to clients. Use tools like Tableau, Power BI, or custom visualization libraries.
- Documentation and Communication: Maintain documentation for data pipelines, data models, and processes. Communicate findings and project progress effectively to clients and team members.
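The extract, transform, validate, and load responsibilities above can be sketched end to end as a minimal pipeline. This is an illustrative example using only the Python standard library; the file contents, column names, and validation rules are hypothetical, not specifics of the role.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: one record has a missing score and inconsistent casing.
RAW_CSV = """id,name,score
1,Alice,90
2,bob,
3,Carol,85
"""

def extract(text: str) -> list[dict]:
    """Extract: parse raw CSV rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows: list[dict]) -> list[dict]:
    """Transform: normalize name casing and keep missing scores as NULL."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "id": int(row["id"]),
            "name": row["name"].strip().title(),
            "score": int(row["score"]) if row["score"] else None,
        })
    return cleaned

def validate(rows: list[dict]) -> list[dict]:
    """Quality check: ids must be unique and positive."""
    ids = [r["id"] for r in rows]
    assert len(ids) == len(set(ids)), "duplicate ids found"
    assert all(i > 0 for i in ids), "non-positive id found"
    return rows

def load(rows: list[dict], conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scores "
        "(id INTEGER PRIMARY KEY, name TEXT, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO scores (id, name, score) VALUES (:id, :name, :score)", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(validate(transform(extract(RAW_CSV))), conn)
print(conn.execute("SELECT name, score FROM scores ORDER BY id").fetchall())
# -> [('Alice', 90), ('Bob', None), ('Carol', 85)]
```

In production these stages would typically run as scheduled, monitored jobs (e.g. in Spark or an orchestration tool) rather than in-process, but the stage boundaries stay the same.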
- Bachelor's Degree in a Relevant Field: While not always a strict requirement, having a bachelor's degree in a field such as computer science, data engineering, information technology, or a related discipline can enhance your qualifications and understanding of core concepts.
- Data Processing Tools and Technologies: Proficiency in data processing tools and technologies commonly used in data engineering roles, such as Apache Spark, Hadoop, Apache Kafka, or cloud-based data processing services (e.g., AWS Glue, Azure Data Factory).
- Database Skills: Familiarity with SQL and relational databases (e.g., PostgreSQL, MySQL) as well as NoSQL databases (e.g., MongoDB, Cassandra) is important for data storage and retrieval.
- ETL (Extract, Transform, Load) Knowledge: Understanding of ETL processes and tools for data ingestion, transformation, and loading. Experience with ETL frameworks like Apache NiFi or Talend can be advantageous.
- Scripting and Programming: Proficiency in programming languages such as Python or Java is valuable for data engineering tasks, including data manipulation and automation of data pipelines.
- Data Modelling and Schema Design: Basic knowledge of data modelling concepts, including the design of data schemas for data warehouses, data lakes, or NoSQL databases.
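The star-schema design mentioned above can be illustrated with a small example: one fact table holding measures, with foreign keys into dimension tables. The tables and columns here are hypothetical, chosen only to show the shape of the schema (sketched via Python's built-in sqlite3 module).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes.
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date     (date_id     INTEGER PRIMARY KEY, day  TEXT);

-- The fact table holds measures plus foreign keys into each dimension.
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    date_id     INTEGER REFERENCES dim_date(date_id),
    amount      REAL
);

INSERT INTO dim_customer VALUES (1, 'Acme');
INSERT INTO dim_date     VALUES (1, '2024-01-01');
INSERT INTO fact_sales   VALUES (1, 1, 1, 99.5);
""")

# A typical analytics query joins the fact table out to its dimensions.
row = conn.execute("""
    SELECT c.name, d.day, f.amount
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_id = f.customer_id
    JOIN dim_date     d ON d.date_id     = f.date_id
""").fetchone()
print(row)  # -> ('Acme', '2024-01-01', 99.5)
```

A snowflake schema would further normalize the dimension tables into sub-dimensions; the star form above trades some redundancy for simpler joins.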