Job Description: Data Engineer | TransOrg Analytics
Why would you like to join us?
TransOrg Analytics specializes in Data Science, Data Engineering and Generative AI, providing advanced
analytics solutions to industry leaders and Fortune 500 companies across India, US, APAC and the Middle
East. We leverage data science to streamline, optimize, and accelerate our clients' businesses.
Visit www.transorg.com to learn more about us.
Responsibilities:
Design, develop, and maintain robust data pipelines using Azure Data Factory and Databricks
workflows.
Implement and manage big data solutions using Azure Databricks.
Design and maintain relational-style tables using Delta Lake on Azure.
Ensure data quality and integrity through rigorous testing and validation.
Monitor and troubleshoot data pipelines and workflows to ensure seamless operation.
Implement data security and compliance measures in line with industry standards.
Continuously improve data infrastructure (including CI/CD) for scalability and performance.
Design, develop, and maintain ETL processes to extract, transform, and load data from
various sources into Snowflake.
Utilize ETL tools (e.g., ADF, Talend) to automate and manage data workflows.
Develop and maintain CI/CD pipelines using GitHub and Jenkins for automated deployment of
data models and ETL processes.
Monitor and troubleshoot pipeline issues to ensure smooth deployment and integration.
Design and implement scalable and efficient data models in Snowflake.
Optimize data structures for performance and storage efficiency.
Collaborate with stakeholders to understand data requirements and ensure data integrity.
Integrate multiple data sources to create a data lake or data mart.
Perform data ingestion and ETL processes using SQL, Sqoop, Spark, or Hive.
Monitor job performance; manage file systems and disk space, cluster and database
connectivity, log files, and backup/security; and troubleshoot user issues.
Design, implement, test, and document a performance benchmarking strategy for platforms as
well as for different use cases.
Set up, administer, monitor, tune, optimize, and govern large-scale implementations.
Drive customer communication during critical events and participate in or lead operational
improvement initiatives.
Qualifications, Skill Set and Competencies:
Bachelor's degree in Computer Science, Engineering, Statistics, Mathematics, or a related quantitative field.
3-6 years of relevant experience in data engineering.
Must have worked on at least one cloud engineering platform: AWS, Azure, GCP, or Cloudera.
Proven experience as a Data Engineer with a focus on Azure cloud technologies/Snowflake.
Strong proficiency in Azure Data Factory, Azure Databricks, ADLS, and Azure SQL Database.
Experience with big data processing frameworks like Apache Spark.
Expert-level proficiency in SQL and experience with data modeling and database design.
Knowledge of data warehousing concepts and ETL processes.
Strong focus on PySpark, Scala and Pandas.
Proficiency in Python programming and experience with other data processing frameworks.
Solid understanding of networking concepts and Azure networking solutions.
Strong problem-solving skills and attention to detail.
Excellent communication and collaboration skills.
Azure certifications – AZ-900 and DP-203 (good to have).
Familiarity with DevOps practices and tools for CI/CD in data engineering.
Certification: Microsoft Azure / Databricks Data Engineer.
Data ingestion - Coding and automating ETL pipelines, both batch and streaming. Should have
worked with both ETL and ELT methodologies using any traditional or modern tech stack: SSIS,
Informatica, Databricks, Talend, Glue, DMS, ADF, Spark, Kafka, Storm, Flink, etc.
Data transformation - Experience working with MPPs, big data, and distributed computing
frameworks on a cloud or cloud-agnostic tech stack: Databricks, EMR, Hadoop, dbt, Spark, etc.
Data storage - Experience working with data lakes and lakehouse architecture: S3, ADLS, Blob, HDFS.
DWH - Strong experience modelling and implementing data warehouses on technologies such as
Redshift, Snowflake, Azure Synapse, BigQuery, and Hive.
Orchestration and lineage - Airflow, Oozie, etc.