Editing
Kotahi data supply
Jump to navigation
Jump to search
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Background== Kotahi provides drops of Kotahi data for us to use for reporting/analysis at or around 12am daily. ==Detail== An S3 target has been created at <code>s3://kpt-kotahi</code>. Karo have permission to push data here. Data arrives every day around midnight, following naming convention of prefix: <code>karo_upload<timestamp></code>, with individual datasets underneath it. 30+ files are provided currently, and these - more or less - each relate to a database table. They are all 'complete' datasets per each load, with the exception of <code>activities</code> which is deltas only from prior export. Files are provided in CSV format. == Data Transfer == The diagram below outlines the Kotahi data load process, showing how files are ingested, transformed, deduplicated, and written into the reporting schema for use in analytics and dashboards. [[File:Kotahi_load.drawio_(3).png|851x851px]] ==Datasets== Table below summarises supplied data: {| class="wikitable sortable" ! Name || Description || Notes |- |- | Activities || Logs of client interactions, including type, date, duration, and additional details stored in CustomJSONData.<ref name="CustomJSONDataNote" /> Note || The CustomJSONData field requires proper escaping of quotes within strings. For example, <code>""\hello""\</code> inside a string should be escaped as <code>\\\"hello\\\"</code>. Ensure data is validated and fixed before import to avoid parsing issues. |- | ActivityTypes || Categorises activities/forms, e.g.,"Narrative Case Notes" "Requisition Approval" |- | AssignedUsers || Tracks which users are assigned to specific clients. |- | ClientCorrespondence || Records communications between clients and providers. |- | ClientIwi || Links clients to their associated iwi. Not in use for now. |- | ClientRelationshipParticipants || Defines roles in client relationships, like "Parent" or "Sibling" . |- | ClientRelationshipTypes || Defines valid relationship types between participants, such as "Parent-Child" or "Siblings". |- | ClientRelationships || Tracks relationships between clients (e.g., parent-child). |- | ClientServiceStatus || Tracks the status of a client in a service, e.g., "Active" or "Inactive".<ref name="ClientServiceStatusNote" /> |- | ClientTasks || Stores tasks assigned to clients, including due dates and statuses. |- | Clients || Core client data, including personal details like name and DOB. |- | CorrespondenceType || Defines types of communication |- | CustomActivityFieldsData || Stores custom data fields linked to specific activities. |- | Dhbs || Lists district health boards, their regions, and their respective ID. |- | Ethnicities || Catalogues ethnicity options for demographic data. |- | Iwi || Lists specific iwi (tribes) associated with clients or users. |- | IwiGroupings || Groups iwi into broader categories. |- | MessageTemplates || Predefined message templates for correspondence. |- | Providers || Details about service providers, e.g., "Tui Medical" or "Dr Jane Doe" || * Where the column '''Type''' = 0: General Practitioner. * Where the column '''Type''' = 1: Provider. |- | ServiceStatus || Defines statuses for services, e.g., "New Referral" or "Completed." <ref name="ClientServiceStatusNote" /> |- | Services || Information about available services, including their names and descriptions. |- | UserLocalities || Tracks the geographic areas assigned to specific users. |- | UserServices || Links users to the services they are involved with. |- | Users || Stores user details, including names, roles, and contact information. |- | Hauora data || Data about hauora goal plans || Hauora goal plans are currently not being utilised at time of writing. |- | Calendar data || Appointments data including schedules, availability, and events || Includes calendars, schedule entries, event details, and linked clients |- | Group activities data || Group activities data including details about participants and staff etc. |} <references> <ref name="CustomJSONDataNote">The CustomJSONData field in Activities stores flexible, structured data in JSON format. It is used for capturing additional details such as specific requests, health goals, or screening statuses, etc. enabling the retrieval of custom attributes.</ref> <ref name="ClientServiceStatusNote">Although the ClientServiceStatus table tracks a client's service status (e.g., "Active," "Inactive"), we are using the latest "Activity Date" on Power BI to dynamically determine activity status, addressing cases where statuses are stuck on "New Referral" for example</ref> </references> ===Materialized Views for Reporting=== ====Technical Implementation==== The following materialized views have been created in the <code>kotahi</code> schema to support reporting requirements: {| class="wikitable sortable" ! Materialized View !! Purpose !! Refresh Schedule |- | <code>mvw_activities_summary</code> || Details of kotahi activities|| 00:45 daily |- | <code>mvw_client_services</code> || Shows active/inactive service relationships with status history || 00:50 daily |- | <code>mvw_client_summary</code> || Core client demographics + household/location metadata || 00:55 daily |- | <code>mvw_tupaia_report</code> || Kotahi data relating to tupaia service || 01:05 daily |- | <code>mvw_tupaia_stages</code> || Tracks client progression through service stages || 01:15 daily |- | <code>mvw_tupaia_stages_counts</code> || Count of tupaia clients in each stage of tupaia pathways || 01:25 daily |} ===Cron Job Configuration=== Refresh schedules are managed via PostgreSQL's <code>pg_cron</code> extension: <syntaxhighlight lang="sql"> -- Scheduled between 00:45-01:25 daily SELECT cron.schedule('45 0 * * *', $$REFRESH MATERIALIZED VIEW kotahi.mvw_activities_summary$$); SELECT cron.schedule('50 0 * * *', $$REFRESH MATERIALIZED VIEW kotahi.mvw_client_services$$); SELECT cron.schedule('55 0 * * *', $$REFRESH MATERIALIZED VIEW kotahi.mvw_client_summary$$); SELECT cron.schedule('5 1 * * *', $$REFRESH MATERIALIZED VIEW kotahi.mvw_tupaia_report$$); SELECT cron.schedule('15 1 * * *', $$REFRESH MATERIALIZED VIEW kotahi.mvw_tupaia_stages$$); SELECT cron.schedule('25 1 * * *', $$REFRESH MATERIALIZED VIEW kotahi.mvw_tupaia_stages_counts$$); </syntaxhighlight> ===JSON data=== The <code>customfielddatajson</code> field contains the contents of any activity. Obviously this varies between activities, and over time (if activities are modified). We will likely need some kind of mapping document to keep track of field changes and/or multiple fields which we need to perform analysis/calculations on.<ref>For example, financial summaries or client/aiga goal plans which span different services.</ref> A fairly nice way to unnest JSON for specific tasks can be done via: <syntaxhighlight lang="sql" line> select distinct a.clientid, j."RequestDate", j."RequestProduct1", j."RequestAmountTotal" from kotahi.activities a, jsonb_to_record(customfielddatajson) as j( "RequestDate" date, "RequestProduct1" text, "RequestAmountTotal" float ) where activitytypeid = 27 --Whanau ora request submission β</syntaxhighlight> ==Pinnacle Workplan== ===Current state=== Database schema and target objects built (staging/rpt) Staging tables being manually loaded weekly ===TO DO=== * Automation of data pickup and load to staging ** StepFunction: ** S3 trigger --> ** Lambda invocation --> ** Write data to staging ** Truncate/reload to rpt ** <code>Activities</code> needs to be merged/upserted * Considerations: ** Audit data (inc. diagnostics where applicable) ** Error logging ** Point in time reporting ** Exclude confidential narratives ===Required analysis objects in <code>rpt</code>=== Propose use of materialized views (?) to build objects showing us the following: * Family views ** Location ** Household size ** Demographic/descriptive info * Link to Indici<ref>Relies on NHI lkup in Indici.</ref> ** Cross-reference Indici<>Kotahi data.<ref>Where is this useful/valuable?</ref> * Point in time ** How many cases on this date ** Family risk/need over time ** Rate of change in risk/need over time * Goal plan analysis ** Goals/needs met ** Mapping of service-level goals/domains to KPT or other wellbeing domains * Duration/effort ** Multiple staff/ multiple clients ** Activity duration ** Effort/duration and link to goal completion or change in risk/need ADD MORE HERE!! ==References== [[Category:Kotahi]]
Summary:
Please note that all contributions to Kautepedia are considered to be released under the Creative Commons Attribution-NonCommercial-ShareAlike (see
Kautepedia:Copyrights
for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource.
Do not submit copyrighted work without permission!
Cancel
Editing help
(opens in new window)
Navigation menu
Personal tools
British English
Not logged in
Talk
Contributions
Log in
Namespaces
Page
Discussion
British English
Views
Read
Edit
Edit source
View history
More
Search
Navigation
Main page
Recent changes
Random page
Help about MediaWiki
Tools
What links here
Related changes
Special pages
Page information