case study

Atlas Bridge

ETL

Backend

Node.js

Koa

Datadog

CronJob

Transactions

Conflict resolution

Built a bi-directional ETL bridge that keeps the legacy Atlas Phoenix monolith and the modern Atlas micro-services platform consistent during a long deprecation period.

Overview & Problem

The Atlas ecosystem consists of a legacy Phoenix monolith and the new full-stack JavaScript Atlas Platform built as micro-services.

Both systems are temporarily running in parallel in production until we deprecate the legacy monolith. Without a sync mechanism, data between the systems would quickly become inconsistent, leading to operational issues and increased maintenance overhead.

Solution

I built Atlas Bridge, an ETL system in Node.js, using Koa, that keeps both systems in sync automatically. It handles bi-directional data flow, ensures transactional integrity, and leverages Datadog observability for production reliability.

Atlas Bridge runs on a scheduled CronJob that checks for data changes every minute using the updated_at field in both databases to determine the most recent change.

System Architecture

Highlights

Atlas Bridge sits at the centre, handling ETL, data transformation, and conflict resolution.
Cronjobs trigger scheduled syncs, ensuring up-to-date consistency.
Datadog observability provides metrics, alerting, and dashboards for system health and reliability.

Technical Challenges & Solutions

Bi-directional sync: prevented overwriting newer updates via timestamp-based conflict resolution.
Schema differences: transformation layer reconciles legacy Atlas fields with new micro-service schemas.
Transactional safety: database transactions ensure atomic updates and prevent partial writes.
Production reliability: observability in Datadog ensures failures are detected, tracked, and alerted immediately.

Code Snippets

// Paginate tenant Postgres (Knex) with offset/limit. Process each page in parallel,
// recurse until a short page. Stages ordered: users/staff → teams/categories → geofences/members → reconcile deletes.
// e.g. Promise.all([ batchProcess(users...), batchProcess(staff...), ... ])

async function batchProcess<R, P>(
  retrieve: (offset: number, limit: number) => Promise<R[]>,
  processor: (row: R) => Promise<P>,
  offset = 0,
  limit = 100,
): Promise<P[]> {
  const records = await retrieve(offset, limit);
  const processed = await Promise.all(records.map(processor));
  if (records.length === limit) {
    return [...processed, ...(await batchProcess(retrieve, processor, offset + limit, limit))];
  }
  return processed;
}

Batched, ordered ETL from tenant DB into platform services with explicit reconciliation passes.

// Per-tenant CronJob: labels for org/tenant, schedule from config, tight job history,
// single completion. Container runs the compiled sync entrypoint (not the HTTP server).
const cronJob = {
  kind: 'CronJob',
  metadata: { labels: { /* org id, tenant name, component: sync-job */ } },
  spec: {
    schedule: config.cron.schedule,
    failedJobsHistoryLimit: 0,
    successfulJobsHistoryLimit: 1,
    startingDeadlineSeconds: 120,
    jobTemplate: {
      spec: {
        completions: 1,
        template: {
          spec: {
            containers: [{
              name: 'sync',
              image: config.job.image,
              command: ['/nodejs/bin/node', '--require=dd-trace/init', 'src/job.js'],
              env: [ /* TENANT_*, DATABASE_*, service hostnames, … */ ],
            }],
            restartPolicy: 'OnFailure',
          },
        },
      },
    },
  },
};

Kubernetes CronJobs created/updated from Node; sync workload runs under dd-trace.

Impact & Metrics

Syncs thousands of records per run with zero data inconsistencies.
Reduces manual reconciliation across production systems, saving hundreds of man-hours.
Minimises urgency to remove legacy system, ensuring that we can continue to deliver new features and de-prioritise rebuilding existing services.
Datadog dashboards provide real-time monitoring of ETL health, sync latency, and error rates.

Skills Demonstrated

Backend system design with Node.js and Koa.
ETL pipeline architecture.
Database design, transactions, and indexing.
Scheduling, automation, and failure recovery.
Observability and production monitoring using Datadog.
Bi-directional data synchronisation and conflict resolution.