
Metrics Pipeline

Tap can ingest developer productivity metrics (from tools like GitHub Copilot and Claude Code) and correlate them with feedback data. This lets organizations see how tool adoption metrics align with qualitative developer sentiment.

How It Works

An admin uploads a metrics file through the Metrics Upload page. Tap processes the file in two phases: a preview for validation, then the actual import. Uploaded metrics are linked to developer identities and stored alongside feedback data for correlation analysis.

[Diagram: upload and processing flow. Color key: purple = user actions, orange = backend processing, green = database operations.]

Supported File Formats

| Format | Source | File Type | Key Fields |
| --- | --- | --- | --- |
| GitHub Copilot | Copilot Business usage export | NDJSON | GitHub login, acceptances, suggestions, date |
| Claude Code | Claude Code usage export | JSON | User email, completions, sessions, date |
| Generic CSV | Any tool with CSV export | CSV | Configurable column mapping |

Tap auto-detects the file format based on structure and content, then normalizes all formats into a unified schema before storage.
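
Structure-based detection could look roughly like the sketch below. It is an illustration only, not Tap's implementation: the function name `detectFormat` and the field names `github_login` and `user_email` are assumptions standing in for whatever keys the real exports use.

```typescript
// Hypothetical sketch: the field names checked here are illustrative
// assumptions, not the actual Copilot or Claude Code export schemas.
type MetricsFormat = "copilot" | "claude_code" | "generic_csv" | "unknown";

function tryParseJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined; // not valid JSON
  }
}

function hasKey(value: unknown, key: string): boolean {
  return typeof value === "object" && value !== null && key in value;
}

function detectFormat(content: string): MetricsFormat {
  const trimmed = content.trim();
  if (trimmed === "") return "unknown";

  // Case 1: the whole file is one JSON value (array or object).
  const whole = tryParseJson(trimmed);
  if (whole !== undefined) {
    const first = Array.isArray(whole) ? whole[0] : whole;
    if (hasKey(first, "user_email")) return "claude_code";
    if (hasKey(first, "github_login")) return "copilot"; // one-line NDJSON
    return "unknown";
  }

  // Case 2: NDJSON, where each line is its own JSON object.
  const firstLine = trimmed.split(/\r?\n/)[0];
  const firstRecord = tryParseJson(firstLine);
  if (hasKey(firstRecord, "github_login")) return "copilot";
  if (firstRecord !== undefined) return "unknown";

  // Case 3: a comma-separated header line is treated as generic CSV.
  return firstLine.indexOf(",") >= 0 ? "generic_csv" : "unknown";
}
```

Checking content rather than the file extension means a Copilot export saved as `.json` would still be recognized in this sketch.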

Two-Phase Pipeline

Phase 1: Preview

The previewUpload function validates the file and returns a summary without writing any data:

  • Validates file extension and size (50MB limit)
  • Detects the file format (Copilot, Claude Code, or generic)
  • Parses a sample of rows
  • Returns row count, detected format, column mapping, and any validation warnings

This lets the admin review what will be imported before committing.
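
The preview checks can be sketched as a pure function that inspects the file and returns a summary without writing anything. The function name `previewFile`, the `PreviewResult` shape, and the choice to count non-empty lines as rows are all assumptions made for illustration; the real previewUpload function may behave differently.

```typescript
// Hypothetical sketch of the preview checks. Names and return shape are
// assumptions; only the 50MB limit and the extension list come from the docs.
const MAX_FILE_BYTES = 50 * 1024 * 1024; // assumes binary megabytes
const ALLOWED_EXTENSIONS = [".csv", ".json", ".ndjson"];

interface PreviewResult {
  ok: boolean;
  rowCount: number;
  sampleRows: string[];
  warnings: string[];
}

function previewFile(
  fileName: string,
  content: string,
  sizeBytes: number,
  sampleSize: number = 5
): PreviewResult {
  const warnings: string[] = [];

  // Extension check (case-insensitive).
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    warnings.push(`Unsupported extension: ${ext || "(none)"}`);
  }

  // Size check.
  if (sizeBytes > MAX_FILE_BYTES) {
    warnings.push("File exceeds the 50MB limit");
  }

  // Simplification: non-empty lines stand in for rows, which is only
  // accurate for line-oriented files such as NDJSON.
  const lines = content.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) warnings.push("File contains no rows");

  return {
    ok: warnings.length === 0,
    rowCount: lines.length,
    sampleRows: lines.slice(0, sampleSize), // only a sample is returned
    warnings,
  };
}
```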

Phase 2: Processing

The uploadMetrics function performs the actual import:

  • Parses the full file using the detected format
  • Normalizes rows into the unified daily_developer_metrics schema
  • Resolves developer identities (matching GitHub logins and emails to canonical records)
  • Inserts metrics rows with proper foreign key relationships
  • Records the ingestion run with status, row counts, and any errors
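
The normalization step might look something like the following sketch, which maps tool-specific rows onto one shared shape. The row interfaces, the `UnifiedMetricRow` type, and its field names are hypothetical; the actual columns of daily_developer_metrics are not described here.

```typescript
// Hypothetical shapes: the real export fields and the actual
// daily_developer_metrics columns may differ.
interface CopilotRow {
  github_login: string;
  date: string;
  acceptances: number;
  suggestions: number;
}

interface ClaudeCodeRow {
  user_email: string;
  date: string;
  completions: number;
  sessions: number;
}

interface UnifiedMetricRow {
  tool: "copilot" | "claude_code";
  identifierType: "github_login" | "email";
  identifier: string;
  date: string; // YYYY-MM-DD, in UTC
  metrics: { [name: string]: number };
}

// Normalize any parseable date string to a UTC calendar date.
function toIsoDate(value: string): string {
  const parsed = new Date(value);
  if (isNaN(parsed.getTime())) throw new Error(`Invalid date: ${value}`);
  return parsed.toISOString().slice(0, 10);
}

function normalizeCopilot(row: CopilotRow): UnifiedMetricRow {
  return {
    tool: "copilot",
    identifierType: "github_login",
    identifier: row.github_login.trim().toLowerCase(),
    date: toIsoDate(row.date),
    metrics: { acceptances: row.acceptances, suggestions: row.suggestions },
  };
}

function normalizeClaudeCode(row: ClaudeCodeRow): UnifiedMetricRow {
  return {
    tool: "claude_code",
    identifierType: "email",
    identifier: row.user_email.trim().toLowerCase(),
    date: toIsoDate(row.date),
    metrics: { completions: row.completions, sessions: row.sessions },
  };
}
```

Keeping the identifier type alongside the identifier lets the identity resolution step decide how to match each row.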

Identity Resolution

The developer_identities table maps different identifiers (GitHub login, email address) to a single canonical developer record. This is necessary because:

  • Copilot exports use GitHub logins
  • Claude Code exports use email addresses
  • A single developer may appear in both systems

When metrics are ingested, Tap looks up or creates a developer_identities record for each row, ensuring metrics from different tools are attributed to the same person.
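
A minimal in-memory sketch of the look-up-or-create behavior is shown below. The `IdentityResolver` class, its `link` method, and the case-insensitive matching are assumptions for illustration; in Tap this mapping lives in the developer_identities table, and how two identifiers get linked to one developer is not specified here.

```typescript
// Hypothetical in-memory stand-in for the developer_identities table.
type IdentifierType = "github_login" | "email";

class IdentityResolver {
  // Maps "type:value" keys to a canonical developer id.
  private developerIds: { [key: string]: number } = {};
  private nextDeveloperId = 1;

  // Assumption: identifiers are matched case-insensitively.
  private static key(type: IdentifierType, value: string): string {
    return `${type}:${value.trim().toLowerCase()}`;
  }

  // Look up the developer for an identifier, creating one if needed.
  resolve(type: IdentifierType, value: string): number {
    const key = IdentityResolver.key(type, value);
    const existing = this.developerIds[key];
    if (existing !== undefined) return existing;
    const created = this.nextDeveloperId++;
    this.developerIds[key] = created;
    return created;
  }

  // Attach an additional identifier to an existing developer, so rows
  // from another tool resolve to the same canonical record.
  link(type: IdentifierType, value: string, developerId: number): void {
    this.developerIds[IdentityResolver.key(type, value)] = developerId;
  }
}

// Usage: a Copilot row and a Claude Code row for the same person.
const resolver = new IdentityResolver();
const octocatId = resolver.resolve("github_login", "octocat");
resolver.link("email", "octocat@example.com", octocatId);
const sameDeveloper = resolver.resolve("email", "octocat@example.com") === octocatId;
```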

Rate Limiting and Validation

| Constraint | Value |
| --- | --- |
| Upload rate limit | 10 uploads per hour per organization |
| Max file size | 50MB |
| Allowed extensions | .csv, .json, .ndjson |
| Duplicate handling | Upsert on (developer_identity_id, date) to prevent double-counting |
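
Two of these constraints can be illustrated with a short in-memory sketch: an upsert keyed on (developer_identity_id, date), and a per-organization upload limiter. The function names, the `MetricRow` shape, and the rolling one-hour window are assumptions; the source does not say whether the real limit uses a rolling or fixed window.

```typescript
// Hypothetical row shape; only the upsert key comes from the docs.
interface MetricRow {
  developerIdentityId: number;
  date: string; // YYYY-MM-DD
  acceptances: number;
}

// Upsert keyed on (developer_identity_id, date): re-uploading the same
// day for the same developer replaces the row instead of adding to it.
function upsertMetrics(store: { [key: string]: MetricRow }, rows: MetricRow[]): void {
  for (const row of rows) {
    store[`${row.developerIdentityId}|${row.date}`] = row;
  }
}

const UPLOADS_PER_HOUR = 10;
const HOUR_MS = 60 * 60 * 1000;

// Assumption: a rolling window. Returns true and records the upload if
// the organization has made fewer than 10 uploads in the past hour.
function allowUpload(
  history: { [organizationId: string]: number[] },
  organizationId: string,
  nowMs: number
): boolean {
  const recent = (history[organizationId] || []).filter((t) => nowMs - t < HOUR_MS);
  const allowed = recent.length < UPLOADS_PER_HOUR;
  if (allowed) recent.push(nowMs);
  history[organizationId] = recent;
  return allowed;
}
```

Because the second write for a given key overwrites the first, uploading the same file twice leaves the stored totals unchanged.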

Audit Trail

Every upload is recorded in the metric_ingestion_runs table:

| Field | Purpose |
| --- | --- |
| file_name | Original uploaded filename |
| status | processing, completed, or failed |
| rows_processed | Number of metrics rows written |
| rows_errored | Number of rows that failed validation |
| error_details | JSON array of specific row-level errors |
| organization_id | Which organization uploaded the file |
| created_at | When the upload started |
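
As a sketch, a run record with these fields could be modeled and updated as follows. The field names match the table, but the `RowError` shape, the helper names `startRun` and `finishRun`, and the rule for choosing between completed and failed are assumptions rather than documented behavior.

```typescript
type RunStatus = "processing" | "completed" | "failed";

// Hypothetical shape for entries in error_details.
interface RowError {
  row: number;
  message: string;
}

interface MetricIngestionRun {
  file_name: string;
  status: RunStatus;
  rows_processed: number;
  rows_errored: number;
  error_details: RowError[];
  organization_id: string;
  created_at: string; // ISO 8601 timestamp
}

// Create the record when the upload starts.
function startRun(fileName: string, organizationId: string, now: Date): MetricIngestionRun {
  return {
    file_name: fileName,
    status: "processing",
    rows_processed: 0,
    rows_errored: 0,
    error_details: [],
    organization_id: organizationId,
    created_at: now.toISOString(),
  };
}

// Assumption: a run counts as failed only when nothing was written and
// at least one row errored; otherwise it is marked completed.
function finishRun(run: MetricIngestionRun, rowsProcessed: number, errors: RowError[]): MetricIngestionRun {
  const status: RunStatus = rowsProcessed === 0 && errors.length > 0 ? "failed" : "completed";
  return {
    ...run,
    status,
    rows_processed: rowsProcessed,
    rows_errored: errors.length,
    error_details: errors,
  };
}
```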