Universal Translator

AI-Powered Inclusive Government Communication

National Finals · GovHack 2024 · Team Lead

Overview

Universal Translator

AI-powered universal translator that seamlessly bridges communication gaps between government and citizens. This innovative tool ensures that vital...

Agentic Workflows

Explored agentic flows post-GovHack to see what can be achieved using agents to better align documents with the Federal Plain Language Style Guide. A...

AI Governance

We completed a self-assessment using the NSW AI Assessment Framework during the GovHack period. This sixty-page document is an important asset for...

Our Mission

Enhance civic life, drive democratic renewal, and build belonging by fostering social inclusion. We aim to empower all citizens—regardless of language, background, ability, or accessibility needs—to fully participate in society by improving their access to vital information and services.

Additionally, we equip government staff at all levels with AI-powered tools that serve as co-pilots, reducing administrative burdens and improving efficiency in delivering public services.

"We're committed to increasing cultural participation, and helping businesses to innovate, adapt and grow."

— City of Sydney

"Australia is a prosperous, safe and united country. Our inclusive national identity is built around shared values."

— Department of Home Affairs

Universal Translator Solution Overview - AI-powered government communication system

Complete system architecture: RAG pipeline, CrewAI agents, and accessible citizen interface

The Problem We're Solving

Communication Barriers

• Non-English speakers struggle with government services
• Complex legal/bureaucratic language excludes many citizens
• Accessibility needs often unmet (vision, hearing, cognitive)
• Low literacy levels prevent understanding of vital information

Government Challenges

• Staff overwhelmed with translation and simplification requests
• Inconsistent application of style guide standards
• No scalable way to verify AI-generated content accuracy
• Disinformation spreading faster than fact-checking

Our Solution: AI Universal Translator

A trusted and personalised AI tool designed to bridge communication gaps between government and the diverse population it serves. The translator adapts to multiple access channels, focusing on the ones most relevant to citizens, tailored to their context and accessibility needs.

🌐

Language Translation

Translates content into the user's preferred language with cultural context awareness.

📝

Simplified Language

Adjusts reading levels to ensure content is accessible to all literacy levels.

🔊

Text-to-Speech

Converts written content into speech for auditory assistance.

📋

Summarization

Provides concise summaries of complex government documents and information.

✅

Style Guide Compliance

Reviews documents for Australian Style Manual compliance automatically.

🤖

Autonomous Verification

AI agents verify accuracy of generated responses before delivery.
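
One way to check that the Simplified Language feature actually lowers the reading level is to score the text before and after rewriting. The sketch below computes the Flesch Reading Ease score with a crude vowel-group syllable counter. The source does not say which readability metric the team uses, so the metric, the function names, and the two sample sentences are all assumptions made for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of vowels (a heuristic, not exact)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Made-up sample text, not taken from any government document.
original = (
    "Applicants must furnish documentation substantiating eligibility "
    "prior to the commencement of the assessment procedure."
)
simplified = "Send us proof that you can apply. Do this before we start the check."

if __name__ == "__main__":
    print(f"original:   {flesch_reading_ease(original):.1f}")
    print(f"simplified: {flesch_reading_ease(simplified):.1f}")
```

A score like this only measures sentence and word length. It cannot tell whether the simplified text still says the same thing, which is why the verification agents remain necessary.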

The Application Interface

Citizens interact through a simple chat interface. Behind the scenes, the system uses RAG (Retrieval-Augmented Generation) to pull accurate information from authoritative government sources, then presents it in the user's preferred format and language.

Universal Translator web interface showing chat and translation features

Technical Architecture

RAG Pipeline: From Data to Intelligence

The system uses Retrieval-Augmented Generation (RAG) to ensure responses are grounded in authoritative government data. Unstructured data from official documents and websites is transformed into actionable intelligence.

┌─────────────────────────────────────────────────────────────────────────┐
│                         RAG PIPELINE ARCHITECTURE                         │
└─────────────────────────────────────────────────────────────────────────┘

   Government Sources              Vector Database              LLM Response
   ─────────────────               ───────────────              ────────────
         │                              │                            │
         ▼                              ▼                            ▼
┌─────────────────┐           ┌─────────────────┐           ┌─────────────────┐
│  Data Ingestion │           │    ChromaDB     │           │  LangChain +    │
│  • Websites     │    ───►   │  • Embeddings   │    ───►   │  LLM Generation │
│  • Documents    │           │  • Semantic     │           │  • Verified     │
│  • Style guides │           │    search       │           │    responses    │
└─────────────────┘           └─────────────────┘           └─────────────────┘
         │                              │                            │
         ▼                              ▼                            ▼
   Citizenship data              Context retrieval            Plain language
   Style Manual                  Semantic matching            Multi-language
   Service info                  Relevance scoring            Accessibility

Data Ingestion

Converts unstructured government data into vector database-friendly format for efficient retrieval.

Semantic Search

Identifies semantically equivalent terms for more intuitive query responses across languages.

Contextual Understanding

Provides nuanced responses by understanding context, not just keyword matching.
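
The three stages above (ingest, search, generate) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: `TinyVectorStore` is a hypothetical in-memory stand-in for ChromaDB, `embed` uses word counts instead of a learned embedding model, and the three stored sentences are placeholder text rather than real government content.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real pipeline would use a learned embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database such as ChromaDB."""

    def __init__(self):
        self._docs = []  # list of (vector, text, source)

    def add(self, text: str, source: str) -> None:
        self._docs.append((embed(text), text, source))

    def query(self, question: str, k: int = 2):
        """Return the k stored passages most similar to the question."""
        q = embed(question)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[0]), reverse=True)
        return [(text, source) for _, text, source in ranked[:k]]


def build_prompt(question: str, store: TinyVectorStore, k: int = 2) -> str:
    """Assemble the grounded prompt that would be sent to the LLM."""
    context = "\n".join(f"[{src}] {text}" for text, src in store.query(question, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Placeholder passages for illustration only.
store = TinyVectorStore()
store.add("You must pass the citizenship test to become a citizen.", "citizenship-info")
store.add("Use short sentences and everyday words in plain language.", "style-manual")
store.add("Report a pothole to your local council online.", "service-info")

if __name__ == "__main__":
    print(build_prompt("How do I report a pothole?", store, k=1))
```

Word-count vectors only match shared words, so this sketch cannot find semantically equivalent terms across languages. That capability comes from the embedding model, which is the part swapped out here.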

CrewAI: Autonomous Agent Workflows

We use CrewAI to orchestrate multiple AI agents that work together to validate, verify, and enhance responses. Each agent has a specific role, ensuring accuracy and compliance before any content reaches the user.

CrewAI agent workflow showing verification and validation pipeline

Agent collaboration: Researcher → Writer → Editor → Fact-checker → Publisher
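
The hand-off order in the caption above can be illustrated with a plain sequential pipeline. This sketch deliberately does not use the CrewAI API: each "agent" is an ordinary function with a trivial rule where the real system would call an LLM, and the `Draft` class, the `TRUSTED_FACTS` table, and the fallback message are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Draft:
    """State passed from one agent to the next."""
    question: str
    facts: list = field(default_factory=list)
    text: str = ""
    verified: bool = False
    log: list = field(default_factory=list)


# Placeholder knowledge base; a real researcher agent would query the RAG store.
TRUSTED_FACTS = {"pothole": "Potholes can be reported to the local council."}


def researcher(d: Draft) -> Draft:
    d.facts = [fact for key, fact in TRUSTED_FACTS.items() if key in d.question.lower()]
    d.log.append("researcher")
    return d


def writer(d: Draft) -> Draft:
    d.text = "  ".join(d.facts)  # deliberately untidy spacing for the editor to fix
    d.log.append("writer")
    return d


def editor(d: Draft) -> Draft:
    d.text = " ".join(d.text.split())  # normalise whitespace
    d.log.append("editor")
    return d


def fact_checker(d: Draft) -> Draft:
    # Verified only if there are facts and every one of them survives in the text.
    d.verified = bool(d.facts) and all(fact in d.text for fact in d.facts)
    d.log.append("fact_checker")
    return d


def publisher(d: Draft) -> Draft:
    if not d.verified:
        d.text = "We could not verify an answer. Please contact the agency directly."
    d.log.append("publisher")
    return d


PIPELINE = [researcher, writer, editor, fact_checker, publisher]


def run(question: str) -> Draft:
    draft = Draft(question)
    for step in PIPELINE:
        draft = step(draft)
    return draft


if __name__ == "__main__":
    print(run("How do I report a pothole?").text)
    print(run("What is the weather today?").text)
```

The design point is that the publisher refuses to release unverified text, so a failure upstream produces a safe fallback instead of a confident but unsupported answer.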

Technology Stack

Built with scalability, flexibility, and accessibility in mind. All components are deployed in Docker containers using docker-compose for easy local development and cloud deployment.

Backend

Python · LangChain · LangServe · FastAPI · ChromaDB

Frontend

Vue.js · Vuetify · Vite · Node.js · Express

AI & Infrastructure

CrewAI · Docker · OpenAI/Gemini · Text-to-Speech

Real-World Use Cases

🌏

Non-English Speaker with Local Council

A resident who doesn't speak English can confidently communicate with their local council in their native language. They can ask questions, raise concerns, and receive accurate responses in real time, without needing a translator, ensuring privacy and direct communication.

📄

Simplifying Complex Government Forms

Faced with lengthy, complicated government forms, a citizen uses the system to automatically simplify the documentation into plain, readable language. Key terms and sections are explained in an easy-to-understand manner, helping them navigate the process with confidence.

📚

Accessibility for Low Literacy Levels

An individual with a low reading level uses the tool to automatically simplify a government document. The system breaks down complex language into simpler terms, making it easier for them to understand and engage with essential information.

⏱️

Time-Poor Citizen Receiving Summaries

A busy citizen can quickly receive a summary of important government information or reports. Instead of sifting through detailed documents, the system provides them with concise, clear summaries of the key points, saving them time while keeping them informed.

🕳️

Reporting Issues in Plain Language

A citizen who wants to report a pothole or other issue can easily do so using the tool, which allows them to describe the problem in plain English. The system captures their input and formats it appropriately for submission to the relevant government department.

👩‍💼

Government Employee Ensuring Compliance

A government employee uses the system to review a document they have written, ensuring it aligns with the latest guidelines from the Australian Style Manual. The tool checks for adherence to style, tone, punctuation, and inclusive language.

✅

Data Source Verification

The system reviews incoming information from various sources (social media, public platforms, news outlets) and cross-references it with authoritative and trusted databases to check for accuracy and reliability.

🛡️

Disinformation Detection

Using AI techniques such as Retrieval-Augmented Generation (RAG), the system automatically checks whether information aligns with known, verified facts. If discrepancies are detected, the system flags potential disinformation and provides a summary of conflicting information.

🤝

Increasing Volunteering Through AI

Volunteering rates are declining partly because people are unaware of local needs or find it difficult to understand how they can help. This use case focuses on making volunteer opportunities easier to understand and engage with.

Responsible AI Integration

Recognising the need for ethical, responsible, and safe AI, we are committed to transparency and public trust. We completed a self-assessment using the NSW AI Assessment Framework, an important asset for any government AI project.

NSW AI Assessment Framework self-assessment

Australia's AI Ethics Principles

✓ Human, societal, and environmental wellbeing
✓ Human-centred values
✓ Fairness
✓ Privacy protection and security

Our Commitments

✓ Transparency and explainability
✓ Contestability
✓ Accountability
✓ Reliability and safety

Transparency Dashboard

A governance layer oversees the AI orchestration engine, monitoring performance metrics related to accessibility, accuracy, and fairness. Both citizens and government agencies can track key outcomes and intervene if needed.

Accuracy

Response quality metrics

Fairness

Bias detection

Accessibility

WCAG compliance

Trust

User feedback
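
As a rough illustration of what the dashboard could aggregate, the sketch below turns a log of interactions into accuracy, fairness, and trust figures. The metric definitions are assumptions, not the project's: accuracy is taken as the share of answers that passed verification, fairness as the gap in that share between the best and worst served language, and trust as the mean user rating. WCAG compliance is not covered because it needs an audit of the interface, not of the interaction log.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged exchange (hypothetical schema)."""
    language: str
    answer_verified: bool
    user_rating: int  # 1 (poor) to 5 (excellent)


def summarise(interactions):
    """Aggregate logged interactions into dashboard figures."""
    if not interactions:
        raise ValueError("no interactions to summarise")

    by_language = {}
    for item in interactions:
        by_language.setdefault(item.language, []).append(item.answer_verified)
    verified_rate = {lang: sum(flags) / len(flags) for lang, flags in by_language.items()}

    return {
        "accuracy": sum(i.answer_verified for i in interactions) / len(interactions),
        # A large gap means some language groups get verified answers less often.
        "fairness_gap": max(verified_rate.values()) - min(verified_rate.values()),
        "trust": sum(i.user_rating for i in interactions) / len(interactions),
        "verified_rate_by_language": verified_rate,
    }


if __name__ == "__main__":
    sample = [  # made-up log entries
        Interaction("en", True, 5),
        Interaction("en", True, 4),
        Interaction("vi", True, 4),
        Interaction("vi", False, 3),
    ]
    print(summarise(sample))
```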

Agentic Flows and Plain Language Style Guide

We have continued exploring agentic flows post-GovHack to see what can be achieved using agents to better align documents with the Federal Plain Language Style Guide.

Plain Language Style Guide Agents Diagram

Plain Speaking Agents: AI-Powered Style Guide Implementation

A system of specialised AI agents, built using CrewAI, working together to transform complex government communications into clear, accessible content that adheres to the Australian Government Style Manual guidelines.

Core Style Analysis Agents (Parallel Processing)

Style Guide Ingestion Agent

Processes original document
Generates markup-based recommendations
Lists core style elements requiring attention

Inclusivity & Accessibility Agent

Reviews document for accessibility concerns
Provides markup-based recommendations for improvements

Content Structure Agent

Analyses document organisation
Provides structural recommendations in markup format

Content Type Agent

Identifies content type
Provides type-specific recommendations in markup format

Reference Attribution Agent

Reviews document sources
Provides attribution recommendations in markup

Central Integration

Integrate Styles Agent

Critical Central Component

Serves as the central consolidation point
Takes input from ALL style analysis agents
Resolves potential conflicts between recommendations
Produces a unified, coherent final document
Ensures consistency across all applied changes
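
The fan-out and fan-in pattern described above can be sketched with the standard library. The analysers here are simple rule-based stand-ins for the LLM-backed agents, and the conflict rule (the highest-priority recommendation wins for each line) is an assumption, since the source does not say how the Integrate Styles Agent resolves conflicts. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    line: int
    agent: str
    suggestion: str
    priority: int  # higher wins when two agents flag the same line (assumed rule)


def style_agent(lines):
    return [Recommendation(i, "style", "use active voice", 1)
            for i, text in enumerate(lines) if " is required to be " in text]


def accessibility_agent(lines):
    return [Recommendation(i, "accessibility", "use descriptive link text", 2)
            for i, text in enumerate(lines) if "click here" in text.lower()]


def structure_agent(lines):
    return [Recommendation(i, "structure", "split long sentence", 1)
            for i, text in enumerate(lines) if len(text.split()) > 25]


ANALYSERS = [style_agent, accessibility_agent, structure_agent]


def analyse_in_parallel(lines):
    """Run every analyser on the same document concurrently and pool the results."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda agent: agent(lines), ANALYSERS))
    return [rec for recs in results for rec in recs]


def integrate(recommendations):
    """Keep one recommendation per line, preferring the highest priority."""
    chosen = {}
    for rec in recommendations:
        current = chosen.get(rec.line)
        if current is None or rec.priority > current.priority:
            chosen[rec.line] = rec
    return [chosen[line] for line in sorted(chosen)]


if __name__ == "__main__":
    document = [  # made-up sample lines
        "The form is required to be lodged by the applicant, click here.",
        "Thank you.",
    ]
    for rec in integrate(analyse_in_parallel(document)):
        print(rec)
```

Because each analyser only reads the document and returns recommendations, they can run in parallel safely. All writes happen in the single integration step.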

Output Processing

Revision Marker Agent

Compares original and final documents
Creates line-by-line comparison markup

Metadata Agent

Processes final document
Generates JSON metadata file

Summary Agent

Reviews metadata and final document
Produces concise summary
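
The revision-marker and metadata steps map naturally onto Python's standard `difflib` and `json` modules. The sketch below is one possible shape, assuming unified-diff output as the comparison markup and a small hypothetical metadata schema. The source does not specify either format.

```python
import difflib
import json


def revision_markup(original: str, final: str) -> list:
    """Line-by-line comparison of the two documents in unified-diff form."""
    return list(difflib.unified_diff(
        original.splitlines(),
        final.splitlines(),
        fromfile="original",
        tofile="final",
        lineterm="",
    ))


def build_metadata(original: str, final: str) -> str:
    """Summarise the revision as a JSON string (hypothetical schema)."""
    diff = revision_markup(original, final)
    body = diff[2:]  # skip the '--- original' and '+++ final' header lines
    metadata = {
        "lines_added": sum(1 for line in body if line.startswith("+")),
        "lines_removed": sum(1 for line in body if line.startswith("-")),
        "word_count_original": len(original.split()),
        "word_count_final": len(final.split()),
    }
    return json.dumps(metadata, indent=2)


if __name__ == "__main__":
    before = "Applicants must lodge the form.\nThank you."  # made-up sample
    after = "Lodge the form.\nThank you."
    print("\n".join(revision_markup(before, after)))
    print(build_metadata(before, after))
```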

System Tools and Utilities

File Handling Tools

Read file operations for document ingestion
Write file operations for outputs
Support multiple file formats and encodings

Metrics and Costs Tracking

(End-of-process utility)

Calculates total token usage
Provides cost analysis of processing
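
A minimal sketch of such an end-of-process utility follows. The `UsageTracker` class is hypothetical, and the per-token prices in the example are invented round numbers. Real figures depend on the provider and model and should be taken from the provider's current pricing.

```python
from dataclasses import dataclass, field


@dataclass
class UsageTracker:
    """Accumulates token counts per agent call and prices them at the end."""
    price_in_per_1k: float   # cost per 1,000 prompt tokens (assumed pricing model)
    price_out_per_1k: float  # cost per 1,000 completion tokens
    calls: list = field(default_factory=list)

    def record(self, agent: str, prompt_tokens: int, completion_tokens: int) -> None:
        self.calls.append((agent, prompt_tokens, completion_tokens))

    def total_tokens(self) -> int:
        return sum(p + c for _, p, c in self.calls)

    def total_cost(self) -> float:
        prompt = sum(p for _, p, _ in self.calls)
        completion = sum(c for _, _, c in self.calls)
        return (prompt / 1000) * self.price_in_per_1k + (completion / 1000) * self.price_out_per_1k

    def tokens_by_agent(self) -> dict:
        totals = {}
        for agent, p, c in self.calls:
            totals[agent] = totals.get(agent, 0) + p + c
        return totals


if __name__ == "__main__":
    tracker = UsageTracker(price_in_per_1k=0.5, price_out_per_1k=1.5)  # invented prices
    tracker.record("style", 1000, 2000)
    tracker.record("summary", 3000, 0)
    print(tracker.total_tokens(), tracker.total_cost(), tracker.tokens_by_agent())
```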

Future Development

A second crew of agents is being developed to:

Ingest the Australian Government Style Manual website content
Store information in a vector database
Enable enhanced style guidance through RAG

"Making government communication more accessible to all Australians through AI-powered assistance, supporting both content creators and consumers in achieving clearer, more effective communication."

Team Datasets

• Style Manual for Australian Government: The data ingestion pipeline parses the unstructured Style Manual content on this site, creates embeddings, and stores them in a vector database for later semantic search. This helps ensure that generated text is semantically aligned with the style guide. Data Set →
• Social Cohesion - Dept of Home Affairs: The data ingestion pipeline parses the data on this site, creates embeddings, and stores them in a vector database for later semantic search. This helps ensure that questions on citizenship are answered from qualified official sources, which minimises hallucinations. Data Set →
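
Before embeddings can be created, parsed page text is usually split into overlapping chunks so that each vector covers a focused passage. The sketch below shows one common approach, fixed-size word windows with overlap. The chunk size, the overlap, and the record fields are assumptions, since the source does not describe how the pipeline splits its documents.

```python
def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list:
    """Split text into word windows of at most max_words, sharing `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def to_records(text: str, source: str, **kwargs) -> list:
    """Attach an id and source label to each chunk, ready for embedding and storage."""
    return [
        {"id": f"{source}-{i}", "source": source, "text": chunk}
        for i, chunk in enumerate(chunk_text(text, **kwargs))
    ]


if __name__ == "__main__":
    page = " ".join(f"w{i}" for i in range(120))  # synthetic 120-word page
    for record in to_records(page, "style-manual"):
        print(record["id"], len(record["text"].split()))
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some words twice.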

Data Story

Unstructured data has traditionally been difficult to take advantage of. It includes text from official government documents, government websites, social media interactions, and more, and it holds insights that structured data alone cannot provide.

By tapping into this resource and applying AI techniques, we can create a more comprehensive and responsive AI Universal Translator that better meets the needs of all citizens, including the diverse long tail that makes up modern Australia.

Our approach involves several advanced techniques to transform this unstructured data into actionable intelligence:

Retrieval-Augmented Generation (RAG):
- Data Ingestion: Converts unstructured data into a vector database-friendly format.
- Semantic Search: Identifies semantically equivalent terms for more intuitive query responses.
Vector Databases:
- Scalability: Ensures efficient data retrieval as the dataset grows.
- Contextual Understanding: Provides nuanced and precise responses by understanding context.
Governance and Compliance:
- AI Principles Alignment: Ensures fairness, transparency, and accountability.
- Performance Metrics: Tracks accessibility, accuracy, and fairness via a governance dashboard.

Data Quality Recommendations

We conducted an assessment of the data quality from unstructured webpages for RAG ingestion. Our focus was on identifying pages with strong examples of good writing style, as this would enhance the accuracy of AI-generated results.

Recommendation: The Australian government should expand its style guides to include a broader range of examples illustrating both good and poor writing practices. This would provide clearer guidance for AI solutions utilising these guides in a RAG pipeline, ultimately improving the accuracy of semantic search outcomes.

AI In Governance - The Bigger Picture

This section looks to answer some of the strategic questions posed by the Infosys International Challenge on AI Governance.

Boosting Operational Efficiency

In a large agency, scaling adoption commensurate with organisation maturity, risk appetite and experience of AI solutions will be key.

Improving Transparency

By strategically implementing AI and ensuring robust governance, government agencies can significantly enhance transparency, improve public trust, and deliver better services.

Ensuring Ethical Use

To ensure the ethical use of AI applications in the public sector, a comprehensive set of frameworks and guidelines should be established.

Data Privacy and Security

To protect the privacy and security of sensitive data used by AI in public sector applications, a multi-faceted approach involving robust frameworks, continuous monitoring, and proactive measures must be implemented.

Building Public Trust

To build and maintain public trust in AI-driven systems, governments must adopt a transparent, inclusive, and accountable approach throughout the development and deployment processes.

Future Adaptations

Emerging trends and technologies in AI have the potential to significantly enhance public sector efficiency and transparency.

Extending an implementation plan to include AI project elements

Government AI projects differ from typical agile or waterfall software or IT projects in several key ways, primarily because they rely on data and machine learning models to deliver results, instead of following predefined logic.

Key Learnings

RAG is Essential for Trust

LLMs hallucinate. For government services, accuracy isn't optional. RAG grounds responses in authoritative sources, making AI trustworthy enough for civic applications.

"Without retrieval augmentation, we'd just be building a very confident liar."

Agent Verification Changes Everything

CrewAI's multi-agent approach means content is researched, written, edited, and fact-checked before delivery. Each agent catches errors the others miss.

"One agent writes, another questions, another verifies. Collaboration beats solo AI."

Government Data Quality Varies

We assessed data quality from unstructured webpages for RAG ingestion. Style guides need more examples of both good and poor practices to improve the accuracy of semantic search in a RAG pipeline.

"Garbage in, garbage out—even with the best AI."

Accessibility Must Be First

Building accessibility in from day one is easier than retrofitting. Text-to-speech, simplified language, and multi-language support should be core features, not add-ons.

"If it doesn't work for everyone, it doesn't work."

Resources

View on GitHub · View All Hackathons

"The generation that taught us 'don't talk to strangers' shouldn't be excluded from participating in democracy because of language or accessibility barriers."

— GovHack 2024 National Finals Entry