Audit Summary

Notes from the initial data audit (Step 1). The full report is in AUDIT.md at the project root.

Corpus

  • 152 CSV files · 2.8 GB total · 10.5M data rows
  • 152 tables documented in the DDL (6,631 column metadata entries, operator-supplied)
  • 92 semantic relationships between 90 entity types (rises to ~124 after our manual node additions)
  • 1,025 aircraft in the fleet roster, across 125 distinct models

Seed strategy

The DB load (db/ramco_aviation.sqlite, 605 MB) is restricted to the 20 aircraft with the highest activity scores, plus the master tables in full. This captures ~86% of all real operational activity while keeping the DB small enough to ship.
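
The selection step can be sketched as follows. This is a minimal illustration, not the actual seed script: the `select_seed` helper, the sample registrations, and the scores are all hypothetical, and the audit's real activity-score formula is not given here.

```python
from typing import Dict, List, Tuple


def select_seed(activity: Dict[str, int], top_n: int) -> Tuple[List[str], float]:
    """Pick the top_n regs by activity score and report the share of activity kept."""
    # Rank registrations from most to least active.
    ranked = sorted(activity.items(), key=lambda kv: kv[1], reverse=True)
    kept = ranked[:top_n]

    total = sum(activity.values())
    covered = sum(score for _, score in kept)
    share = covered / total if total else 0.0
    return [reg for reg, _ in kept], share


# Hypothetical scores, for illustration only (not audit data).
activity = {"AC-1": 500, "AC-2": 300, "AC-3": 150, "AC-4": 50}
regs, share = select_seed(activity, top_n=2)
print(regs, f"{share:.0%}")  # ['AC-1', 'AC-2'] 80%
```

With the real fleet, `top_n=20` is the cut-off that gives the ~86% coverage figure quoted above.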

Top of fleet

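| Rank | Reg | Model | Tech logs | Discrepancies |
|---|---|---|---|---|
| 1 | 1132 | B767-200 | 1,040 | 1,919 |
| 2 | 101 | A310 | 840 | 862 |
| 3 | 6YJMB | A320-211 | 320 | 981 |
| 4 | 1133 | B767-200 | 467 | 593 |
| 5 | 1819 | A320-211 | 485 | 466 |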

Empty tables (15)

DP_CFWDDTL_MEL_PART_CF_DTL, DP_DPMRDA_DOC_ATTACH_DTL, DP_PRMSCH_PARAM_SCHEDULE_DTL, FLOG_DFDPRSL_DEF_DISC_RESOL, FLOG_TLGCR_TECH_LOG_CR_DTL, FLOG_TLGDA_DOC_ATTACH_DTL, FLOG_TLGFIN_FINANCE_INFO, FLOG_TLGWU_WORK_UNIT_DTL, SWO_Error_Map, swo_msgprt_upd_dtl, swo_swocrgh_charge_details_his, swo_swopm_param_upd_dtl, swo_swoprth_part_details_his, swo_swores_swo_res_det, swo_visit_plan_eo_dtl.

Most are audit/history/attachment tables the operator doesn't use or didn't export.
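
One way to re-check this list against a SQLite load is sketched below. It assumes the check runs on the database rather than on the CSVs (the audit itself may have counted CSV rows), and the `empty_tables` helper and the demo column names are hypothetical.

```python
import sqlite3
from typing import List


def empty_tables(conn: sqlite3.Connection) -> List[str]:
    """Return the names of user tables that contain no rows."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    empty = []
    for name in names:
        # Quote the identifier so mixed-case or unusual names are safe.
        quoted = '"' + name.replace('"', '""') + '"'
        # LIMIT 1 avoids a full COUNT(*) scan on large tables.
        if conn.execute(f"SELECT 1 FROM {quoted} LIMIT 1").fetchone() is None:
            empty.append(name)
    return empty


# Tiny in-memory demo; the column names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SWO_Error_Map (code TEXT)")
conn.execute("CREATE TABLE AC_ACI_AIRCRAFT_INFO (reg TEXT)")
conn.execute("INSERT INTO AC_ACI_AIRCRAFT_INFO VALUES ('VT-666')")
print(empty_tables(conn))  # ['SWO_Error_Map']
```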

Data-quality flags

  • 36 orphan aircraft regs — regs appearing in transactions but not in AC_ACI_AIRCRAFT_INFO. Mostly case-variant duplicates (VT-666 vs Vt-666) or test records.
  • Wildcard doc numbers — some DISCP_DOC_NO = '%' rows exist; filtered out by the Brain at query time.
  • Case-sensitive reg columns — handled via seed case-normalisation during load.

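The first two flags can be reproduced on toy data as a rough sketch. Only AC_ACI_AIRCRAFT_INFO and DISCP_DOC_NO come from the audit; the DISCREPANCY table, the AIRCRAFT_REG_NO column, and the choice of upper-casing as the normalisation are assumptions made for illustration.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AC_ACI_AIRCRAFT_INFO (AIRCRAFT_REG_NO TEXT);
CREATE TABLE DISCREPANCY (AIRCRAFT_REG_NO TEXT, DISCP_DOC_NO TEXT);
INSERT INTO AC_ACI_AIRCRAFT_INFO VALUES ('VT-666');
INSERT INTO DISCREPANCY VALUES
    ('VT-666', 'D-001'),
    ('Vt-666', 'D-002'),
    ('TEST-01', 'D-003'),
    ('VT-666', '%');
""")


def orphan_regs(conn: sqlite3.Connection, normalise: bool) -> List[str]:
    """Regs seen in transactions but missing from the aircraft master."""
    # Assumed normalisation: trim whitespace and upper-case.
    key = (lambda s: s.strip().upper()) if normalise else (lambda s: s)
    master = {key(r[0]) for r in conn.execute(
        "SELECT AIRCRAFT_REG_NO FROM AC_ACI_AIRCRAFT_INFO")}
    seen = {key(r[0]) for r in conn.execute(
        "SELECT AIRCRAFT_REG_NO FROM DISCREPANCY")}
    return sorted(seen - master)


raw = orphan_regs(conn, normalise=False)
norm = orphan_regs(conn, normalise=True)
print(raw)   # ['TEST-01', 'Vt-666']
print(norm)  # ['TEST-01']

# '<>' compares against the literal '%' string; it is not a LIKE pattern.
real_docs = conn.execute(
    "SELECT COUNT(*) FROM DISCREPANCY WHERE DISCP_DOC_NO <> '%'"
).fetchone()[0]
print(real_docs)  # 3
```

In this toy case, normalising case removes the case-variant duplicate from the orphan list and leaves only the test record, which matches the pattern described in the first flag.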
See Data Quality and Case Sensitivity in Rules Reference for how the Brain handles these.

See also