Azure Databricks Architect (Part-time or Full-time Consulting Opportunity)
Overview
- **Please Note: This is a part-time or full-time consulting opportunity on a contract basis. A contract-to-hire option is available for the right candidate. We are open to considering candidates who work full-time elsewhere but can commit to 4 hours/day of part-time consulting. Candidates must reside in the Eastern Time Zone, as working hours for this role are 7am-4pm EST. Only US citizens and permanent residents (green card holders) are eligible to apply for this position. No visa sponsorship is available.**
The Azure Databricks Architect is a key position responsible for the setup and ongoing management of the Databricks infrastructure on Azure for clients. This role provides strategic direction, operational oversight, and in-depth technical expertise to infrastructure and data engineers, both internal staff and those at client organizations.
Key Responsibilities
Azure Infrastructure
- Configure Virtual Networks (VNets) with appropriate subnetting
- Establish Network Security Groups (NSGs) and firewall rules
- Set up Azure Private Link for secure connectivity
- Deploy Azure Data Lake Storage Gen2 (ADLS Gen2) with lifecycle policies
- Configure Azure Key Vault for secrets management
- Implement Azure Monitor and Log Analytics workspace
- Establish Azure AD/Entra ID integration for SSO
- Deploy Azure Virtual Machine or Container Instances for Apache Airflow
- Configure SFTP server (Azure VM or Azure Storage SFTP) for data ingestion
Databricks Platform
- Provision Databricks workspaces for Dev, Staging, and Production environments
- Enable Unity Catalog for centralized data governance
- Configure workspace-level security and network isolation
- Set up cluster policies and job compute configurations
- Implement Auto-Loader for streaming data ingestion
- Deploy Delta Live Tables (DLT) for declarative pipeline development
- Enable Delta Lake features: time travel, Z-Order optimization, VACUUM operations
Orchestration & Workflow Management
- Deploy Apache Airflow on Azure compute (VM or AKS)
- Configure Airflow DAGs for data pipeline orchestration
- Integrate Airflow with Databricks Workflows
- Set up job scheduling and dependency management
- Implement retry logic and error handling
- Configure Airflow UI access and authentication
Governance & Access Control
- Configure Unity Catalog metastore and catalogs
- Implement Role-Based Access Control (RBAC)
- Enable attribute-based access control (ABAC)
- Configure column masking for sensitive data (PHI)
- Implement row-level security (RLS) policies
- Establish data lineage tracking
- Enable audit logging and compliance reporting
HIPAA Compliance & Security
- Implement encryption at rest (Azure Storage Service Encryption)
- Configure encryption in transit (TLS 1.2+)
- Enable Azure AD authentication and MFA
- Implement network isolation with Private Link
- Configure audit logging for all PHI access
- Establish data retention and destruction policies
- Implement Business Associate Agreement (BAA) requirements
- Configure backup and disaster recovery for PHI data
- Enable security scanning and vulnerability assessments
Monitoring & Observability
- Centralize logging in Azure Log Analytics
- Configure real-time alerting for pipeline failures
- Create health dashboards for critical systems
- Track SLA compliance and performance metrics
- Monitor Databricks job performance and cluster health
- Implement cost monitoring and budget alerts
- Set up incident notification workflows
CI/CD & DevOps
- Set up Azure DevOps or GitHub for source control
- Implement Infrastructure as Code using Terraform
- Create automated testing framework for data pipelines
- Configure CI/CD pipelines for code deployment
- Implement blue-green or canary deployment strategies
- Establish version control for notebooks and configurations
Testing & Validation
- Deploy test SFTP connection with dummy data source
- Validate end-to-end data flow from SFTP to Gold layer
- Perform data quality validation tests
- Conduct performance and load testing
- Execute security and compliance testing
- Validate backup and recovery procedures
Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field is required.
- A minimum of 5 years of management-related experience in a consulting or corporate setting is required; a life sciences or healthcare environment is preferred.
- A minimum of 5 years of experience in setting up and managing Databricks infrastructure on Azure.
- A minimum of 5 years of experience managing teams, setting priorities, and managing project delivery.
- Strong technical expertise in areas such as server administration and cybersecurity.
- Azure and Databricks certifications are strongly preferred.
Key Capabilities
- Strong problem-solving, analytical, and decision-making abilities.
- Creativity, clear thinking, and excellent communication skills.
- Ability to motivate others to solve challenging technology problems, using a combination of engineering expertise, innovation and leadership.
- Ability to quickly grasp new concepts and to formulate an action plan in a new situation.
- Ability to spearhead and oversee complex projects.
- Excellent leadership and people management skills, with the ability to build and motivate high-performing teams.
- Excellent collaboration and stakeholder management skills.
- Excellent time management skills to prioritize tasks effectively.
Compensation: Based on experience and skill set; not a limitation for the right candidate.
Job Type: Full-time
Pay: $100.00 - $150.00 per hour
Benefits:
- 401(k)
- 401(k) matching
- Dental insurance
- Flexible schedule
- Health insurance
- Life insurance
- Paid time off
- Professional development assistance
- Vision insurance
Application Question(s):
- Do you have hands-on experience setting up and managing Databricks on Azure?
Experience:
- Databricks: 2 years (Preferred)
- Azure: 5 years (Required)
Work Location: Remote