Digital Impact Evaluation Platforms for Social Programs
Transforming how we measure what works in international development through AI-powered, end-to-end platforms for randomized controlled trials
See the paper
The North Star Goal
Drastically cut the cost and time required to get credible evidence on which social programs work
By turning what used to be bespoke, consulting-heavy RCT projects into repeatable, software-driven workflows, more NGOs and governments could afford to test and scale only interventions that prove effective. For example, if simple trials could be run 70–90% faster and cheaper, many more organizations would run them regularly instead of relying on guesswork.
Market Analysis Overview
This field centers on digitally enabled platforms for impact evaluations, especially randomized controlled trials (RCTs), in international development and social programs. It's essentially the tech counterpart to traditional Monitoring & Evaluation (M&E) – think of it as building a "Castor/REDCap for social science" that streamlines running policy experiments (RCTs) in the field.
My Unfair Advantage
Network Access
Strong connections with dozens of founders who use these technologies and currently spend large sums to get this work done. It's much easier for me to contact them and gain access to both the data and the people involved, so I can build something quickly. (Ambitious Impact – access to hundreds of founders / Founders Pledge)
Direct Experience
I was the CTO of an organization in India that would have loved to have this kind of tool, and I have the local connections to deploy something very quickly.
End-to-End Platform Scope
01
Designing the evaluation and randomly assigning participants
The RCT setup phase
02
Digital tools for data collection
Surveys via mobile apps, SMS, etc.
03
Participant engagement for program delivery
Sending reminders or interventions via WhatsApp/SMS, obtaining consent digitally
04
Monitoring and case management
Tracking who is in treatment/control groups, ensuring compliance
05
Outcome measurement
Survey results, administrative data integration
06
Analysis and reporting
Dashboards, statistical analysis of impact
Industry Context
This field overlaps with Monitoring & Evaluation (M&E) tech, impact measurement, and policy experiment platforms. It's a niche within gov-tech and ed-tech/health-tech where the focus is on evaluation. It's also informed by practices in clinical trial software (electronic data capture, etc.), but applied to social programs rather than pharmaceuticals.
Market Size: The Numbers
$212.1B
Official foreign aid in 2024
Total global development assistance
3-10%
M&E budget allocation
Typical percentage of project budgets
$6-21B
Annual M&E market
Just in donor-funded programs
Organizations implementing social programs (NGOs, governments, donors) typically allocate about 3%–10% of project budgets to monitoring and evaluation. With official foreign aid alone at $212.1 billion in 2024, that implies a $6–$21 billion per year market just in donor-funded M&E. This doesn't even count domestic government programs and philanthropy.
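The market band above follows directly from the figures in the text; a quick arithmetic sketch:

```python
aid_2024 = 212.1e9                    # official foreign aid, 2024 (from the text)
low_share, high_share = 0.03, 0.10    # typical M&E share of project budgets

low = aid_2024 * low_share
high = aid_2024 * high_share
print(f"Implied donor-funded M&E market: ${low / 1e9:.1f}B-${high / 1e9:.1f}B per year")
```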
Volume of RCTs: Massive Demand
J-PAL Network
2,300+ randomized evaluations in 99 countries by its affiliates
Innovations for Poverty Action
1,000+ projects in over 60 countries
The push for evidence-based policy has led to thousands of impact evaluations. A comprehensive repository recorded 2,645 RCT-based impact evaluations (vs ~1,615 quasi-experiments) up to 2015, and the total has only grown since. This volume shows strong demand for running trials, though most have been executed with bespoke tools and lots of manpower.
The Cost Problem
Traditional RCTs are expensive and slow
They often cost mid-six to seven figures and take 1–3 years to produce results. For example, an RCT for a nonprofit in Nigeria required a $2.1 million grant just to complete the endline evaluation. Many large-scale evaluations (including program implementation) can run into millions of dollars.
Key Cost Drivers
Hiring and training enumerators
Field staff recruitment and capacity building
Travel to remote areas
Transportation and logistics costs
Lengthy field surveys
Time-intensive data collection
Managing data
Cleaning, processing, and analysis
These costs mean smaller organizations usually can't afford rigorous evaluations, pointing to a market need for cheaper solutions.
Rising Evidence Requirements
Funding agencies and governments increasingly demand proof of impact. There are policy pushes for "what works" evidence – e.g. the US and others have sponsored low-cost RCT competitions to encourage cheaper evaluations. Many large donors (USAID, World Bank, foundations) now have evaluation policies expecting rigorous impact measurement for major projects.
"With aid budgets under strain, optimising the effectiveness of available aid is paramount" - OECD
This creates pressure (and budget availability) for solutions that can deliver credible results faster.
Low Penetration = Opportunity
Despite the need, the "tech" in this space is still emerging. Much of the $6–21B in M&E spending goes to services and labor (consultants, survey firms, research staff) rather than to scalable software. The existing tools only partially address the workflow.
This suggests a greenfield opportunity to convert inefficient services into a product – if the product can truly cut costs.
Why Now? Technology Convergence
Several trends converge to make now an especially opportune moment to launch a startup in digital impact-evaluation platforms:
AI and Data Science Revolution
Recent advances in AI can attack longstanding bottlenecks in evaluations. For example, data cleaning and analysis typically eat up 80% of an evaluator's time – but modern AI tools (from automated data validation to ML-based analysis) can accelerate this.
Generative AI (LLMs) can help draft surveys in local languages, summarize open-ended responses, and even generate first-pass evaluation reports. Meanwhile, specialized causal ML libraries (like Microsoft's EconML or PyWhy) help analysts glean deeper insights (heterogeneous treatment effects, forecasts) much faster than before.
AI Impact on Analysis Speed
Months → Days
An "AI-native" impact platform could turn months of number-crunching into days, making real-time or continuous evaluation feasible. This is a huge "why now" – the tech to automate analysis was simply not there a few years ago.
Digital Communication Ubiquity
The spread of mobile and messaging tech in emerging markets unlocks new ways to run trials at scale. Five years ago, reaching thousands of rural households might mean dispatching armies of surveyors; today, over 2 billion people use WhatsApp, and even basic phones can get SMS/IVR calls.
Platforms like WhatsApp (through providers like Turn.io) or SMS/voice services (Viamo, engageSPARK) can automate everything from participant recruitment and consent via text, to delivering interventions (e.g. reminder messages or educational content), to low-cost follow-up surveys.
Communication Platform Examples
WhatsApp Business
Turn.io provides platform for organizations to use WhatsApp for engagement, popular for helplines and chatbots
SMS/Voice Services
Viamo and engageSPARK enable automated messaging and surveys via basic phones
RapidPro
UNICEF's open-source platform enabling national-scale two-way SMS programs in 40+ countries
Administrative Data Integration
Governments worldwide are digitizing their data (health records, education databases, etc.), often via systems like DHIS2 (used by 70+ countries as a national health information system). This means measuring outcomes no longer always requires expensive bespoke surveys – you can sometimes pull vaccination records, test scores, or welfare payments directly (with permission).
A modern platform that integrates with government data systems can run "low-cost RCTs" by using existing data as outcomes.
COVID-Accelerated Remote Operations
The pandemic forced NGOs and researchers to adopt remote monitoring and data collection. Phone surveys, SMS check-ins, and online dashboards became normalized. Organizations saw that remote methods can work (IPA, for instance, shifted to phone-based surveys and found ways to maintain data quality).
This has opened minds to running RCTs with far less physical field presence, making a digital platform more acceptable to customers now than pre-2020. Essentially, the user behavior barrier is lower – clients are actively seeking tools to do work remotely and efficiently.
Success of Analogous Platforms
In clinical research, digital platforms like Castor EDC have proven that complex trials can be managed with user-friendly software at scale. Castor's platform now supports 15,000+ studies across 90+ countries, with 147,000+ users including top pharma companies.
This provides a proof point to investors and customers: if clinical trials (which have strict regulatory and data requirements) can be largely digitized, we can do the same for social experiments.
Digital Public Goods Investment
Donors and governments have begun funding "digital public goods" in this area (e.g., UNICEF's investments in open-source tools), indicating now is a moment of attention on tech for social impact.
In short, AI + mobile + data + mindset shifts make it possible to slash RCT costs in 2025 in a way that was not possible before – creating a strong "why now" to build a venture here.
Existing Solutions Landscape
Rather than a single category, this ecosystem spans several types of tools and services. A winning startup might combine features or integrate with many of these. Here's a map of key players (incumbent solutions and recent startups), grouped by function:
M&E and Results Management Platforms
These are software used by NGOs/donors to track program metrics, logframes, and evaluations (not RCT-specific, but could be extended for trial management).
DevResults
A widely used monitoring & evaluation web platform for development projects. It's like an operating system for donor program data – tracking indicators, mapping results geographically, and generating reports. Popular with USAID missions and NGOs for managing routine M&E.
Vera Solutions – Amp Impact
An M&E module built on Salesforce (mostly targeted at NGOs, foundations, and impact investors already using Salesforce CRM). Vera Solutions has implemented Amp Impact for 30+ clients managing over $12.5B in programs across 150+ countries.
More M&E Platform Players
TolaData
A lighter-weight SaaS for M&E used by nonprofits. Focuses on simplifying logframe setup, indicator tracking, and integrating with survey tools. Has ~17 employees and about $1.9M annual revenue as of 2025.
ActivityInfo
A database platform originally for humanitarian coordination (tracking activities and indicators, often in UN humanitarian clusters). It supports offline data entry and flexible form definitions.
DHIS2
The de facto government health data platform in over 70 countries. Ministries of Health use DHIS2 to log everything from clinic visits to disease surveillance.
Digital Data Collection Tools (CAPI/CATI)
These are the survey platforms for field research. They are critical, since any impact evaluation needs to gather baseline and endline data (if not using admin data). RCT-specific needs include the ability to randomize assignments or question order, handle longitudinal tracking, and possibly integrate with phone surveys.
Key Data Collection Players
SurveyCTO
A robust, enterprise-grade platform built on the Open Data Kit (ODK) standard. It offers both CAPI (enumerator tablet data entry) and CATI (phone survey) modes, strong form logic capabilities, and even the ability to do small randomizations within a form.
KoBoToolbox
A free/open-source survey tool popular among humanitarian and NGO users. It's essentially a friendly wrapper around ODK as well, with a cloud service. KoBo supports some randomization features.
ODK Collect/Central
The original open-source toolkit for mobile data collection. Many variants (like SurveyCTO, KoBo) are based on it. ODK itself now has an official server (Central) and the Collect mobile app.
Additional Data Collection Tools
Dimagi CommCare
A case management oriented data collection app. Originally for community health workers to track patients over time, it also supports surveys with logic and has ways to randomize. It shines for longitudinal data and managing a list of cases (people/households).
Magpi
A simpler mobile data collection tool which also offers SMS and IVR modules. It's been used for quick surveys and small projects where you need to send automated calls, etc.
Any new venture will likely not build a better survey app from scratch (that's a crowded space), but rather connect these tools into a larger workflow.
Participant Engagement & Program Delivery
Once participants are randomized in an RCT, the intervention often needs to be delivered or at least interactions maintained. Traditionally, this meant in-person services. Now, a number of platforms enable digital service delivery or messaging at scale, which can be harnessed for RCT treatments and ongoing monitoring:
Engagement Platform Leaders
RapidPro
An open-source platform (from UNICEF) for designing and deploying messaging flows via SMS, IVR (interactive voice calls), WhatsApp, and more. It has a visual flow builder and supports two-way interactions. Used by governments and NGOs in dozens of countries.
Turn.io
A social impact-oriented company that is an official WhatsApp Business Solution Provider (BSP). It provides a platform for organizations to use WhatsApp for engagement (popular for helplines, chatbots for health info, etc.).
Viamo
Focused on voice (IVR) and SMS content in local languages. Known for their 3-2-1 service – an on-demand info hotline via IVR in many countries. They also run outbound call campaigns.
Why Engagement Platforms Matter
Participant engagement is a major cost in trials. Enumerator phone calls or visits are expensive; by using these digital rails, you cut those costs dramatically. They also enable scale – you can reach tens of thousands of people simultaneously, which is hard with manual methods.
A modern RCT platform would likely incorporate these channels (either via integration or built-in modules) to automate many interactions that used to require human labor. It's also where AI can play – e.g., using AI chatbots to handle participant Q&A or using text-to-speech in local dialects for IVR content.
Field Operations & Data Collection Services
These are not software products but networks of people or innovative data sources. They are crucial because a platform alone doesn't replace the need for trustworthy data from the field. Partnering with or copying aspects of these can give a startup a moat in execution:
Research Network Organizations
Innovations for Poverty Action (IPA)
A nonprofit research outfit with operations in 17 country offices across Africa, Asia, and Latin America. They have run 1,000+ studies and have teams of enumerators, surveyors, and project managers. They are the gold standard for fieldwork quality in RCTs.
J-PAL
A global network linked to MIT, which coordinates researchers and some implementation of RCTs. They have training programs and policy outreach. Similar to IPA – a network to tap for pilot projects.
IDinsight, Busara Center, Laterite
These are newer research/analytics organizations that marry consulting with rigorous methods. They often partner directly with governments in Africa/Asia. They might be both competitors and partners.
Innovative Data Collection Services
GeoPoll
A venture-backed firm with a large database of mobile phone panel respondents across emerging markets. They can send SMS surveys or conduct call surveys very quickly, leveraging pre-recruited respondents.
Premise Data
Another innovative approach – a smartphone app that pays locals to collect data (photos, observations, short surveys). It's like "Uber for data collection" in 120+ countries.
60 Decibels
A startup specializing in "Lean Data" phone surveys for impact measurement. They have a global call network and boast of turning around surveys in a few weeks, focusing on customer or beneficiary feedback.
Partnership Strategy Insight
These networks are potential partners rather than competitors. The hardest part of running trials in diverse, challenging environments is the "last mile" human element – finding the right respondents, translating questions, ensuring data quality.
A savvy startup would partner with these organizations (or enable a marketplace of local service providers) instead of trying to build its own presence in dozens of countries from scratch. This addresses investors' concern that the business could be "geo-constrained"; by leveraging existing networks, the startup focuses on the software orchestration and leaves local execution to those who do it best.
Identity & Tracking Technology
One specific challenge in longitudinal studies (trials that track people over months/years) is attrition and duplicate identities. If participants can't be reliably re-contacted or identified, the data suffers.
Simprints
A tech nonprofit providing biometric ID (fingerprint and face scanning via mobile) for use in developing country programs. They have worked with large NGOs and governments to ensure, for instance, that each child vaccination is logged to the same child, or that cash transfer recipients aren't enrolling twice.
Digital ID and CRVS systems
Many countries are rolling out national digital ID or using phone numbers as quasi-ID. Also, civil registration systems (for births, deaths) are improving. These can be tapped to trace outcomes.
Data Analysis & Randomization Tools
These are mostly open-source libraries and guidelines, but they form the methodology backbone that any platform must embed.
DIME (Development Impact Evaluation) Wiki – World Bank
A rich knowledge base of how to do various randomization techniques, power calculations, managing spillovers, etc. Having these best practices built into the platform's design is crucial.
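As one example of the best practices a platform would embed, the standard two-arm power calculation fits in a few lines of stdlib Python (this is the textbook formula, not code taken from the DIME wiki):

```python
import math
from statistics import NormalDist

def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Sample size per arm for a two-sided difference-in-means test.

    effect_size is the standardized effect (delta / sigma)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_beta = z.inv_cdf(power)            # quantile for desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

# e.g. detecting a 0.2 SD effect at 80% power:
# n_per_arm(0.2) -> 393 participants per arm
```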
DeclareDesign / randomizr (R packages)
These allow programmatic specification of experiments and random assignment with reproducibility. They could be the back-end for an "assignment" module.
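A minimal sketch of what such an assignment module might do, stratified block randomization with a reproducible seed (illustrative stdlib Python, not the randomizr API):

```python
import random
from collections import defaultdict

def block_randomize(units, stratum_of, block_size=4,
                    arms=("treatment", "control"), seed=42):
    """Assign units to arms via stratified block randomization.

    Within each stratum, units are grouped into blocks; each block
    receives a balanced, shuffled set of arm labels. A fixed seed
    makes the assignment reproducible."""
    assert block_size % len(arms) == 0
    rng = random.Random(seed)
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of[u]].append(u)
    assignment = {}
    for _, members in sorted(strata.items()):
        rng.shuffle(members)                      # random order within stratum
        for i in range(0, len(members), block_size):
            block = members[i:i + block_size]
            labels = list(arms) * (block_size // len(arms))
            rng.shuffle(labels)                   # balanced labels, random order
            for u, lab in zip(block, labels):     # incomplete final blocks are
                assignment[u] = lab               # only approximately balanced
    return assignment
```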
Causal ML libraries
e.g., Microsoft's EconML, the PyWhy project's DoWhy, and Uber's CausalML for uplift modeling. These can help with advanced analysis, such as figuring out for whom a program worked best.
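At its simplest, heterogeneous-effect analysis is the treatment/control contrast recomputed within subgroups; the libraries above do this with ML models and proper inference, but a toy stdlib version conveys the idea:

```python
from statistics import mean

def diff_in_means(rows):
    """Average treatment effect: difference of outcome means
    between treatment (t=1) and control (t=0)."""
    treated = [r["y"] for r in rows if r["t"] == 1]
    control = [r["y"] for r in rows if r["t"] == 0]
    return mean(treated) - mean(control)

def subgroup_effects(rows, key):
    """Heterogeneous effects: the ATE recomputed within each
    subgroup defined by `key` (e.g. gender, region)."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r)
    return {g: diff_in_means(members) for g, members in sorted(groups.items())}
```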
Clinical Trials Tech Analogs
We briefly mentioned Castor EDC. To elaborate on why analogs matter:
Castor EDC
Offers a one-stop electronic data capture with e-consent, patient surveys (ePRO), randomization management (IWRS), etc., for clinical studies. It is used in thousands of studies and has grown rapidly with a for-profit, SaaS model. They've raised venture funding by promising to make clinical trials faster and more inclusive.
This is instructive: the social program RCT space today is like clinical research 10-15 years ago – ready to be transformed by software.
REDCap
A widely-used free data capture tool in academia (with >2 million users globally). While not a commercial company, its adoption proves the demand for flexible electronic forms and databases for research.
Notable Newcomers & Innovations
Finally, some recent startups or projects using new tech in this domain, which could be templates to emulate or at least learn from:
Sopact (Impact Cloud / Sopact "Sense")
A startup offering an impact measurement platform with AI features. They advertise building rigorous impact evaluations in "weeks, not years" and integrating qualitative data, surveys, etc., with AI-driven analysis.
60 Decibels & Lean Data
60 Decibels uses technology (mobile surveys) plus a network to dramatically speed up outcome data collection (getting results in weeks). Clients care about speed – a lean, tech-enabled approach that delivers actionable data quickly can win out.
Behavioral Nudges at Scale
Organizations like the Behavioural Insights Team (BIT) and ideas42 have been doing "A/B tests" in public services at large scale (sometimes millions of people) by sending different messages or forms.
Competitive Landscape Summary
No single player currently provides the full end-to-end RCT workflow as a product. Instead, there's a patchwork of tools and organizations, each excellent at a piece of the puzzle.
A new startup can position itself as the integrator and orchestrator, combining these capabilities into one seamless experience. The moat will come from how well it stitches them together (and the data that flows through it).
Evidence of Customer Demand
The fact that many of the above solutions have paying users validates that organizations spend money to solve M&E problems:
DevResults, TolaData, etc. are selling SaaS licenses
Confirming a budget for software that improves reporting. TolaData making ~$1.9M ARR with a 17-person team suggests dozens of midsize NGO clients find value in it.
SurveyCTO charges fees while free alternatives exist
Implying that reliability and support are worth paying for in critical data collection.
Companies like Viamo and GeoPoll succeed on services model
Meaning clients pay per survey or per call to get data – a product that internalizes those capabilities could instead capture that spend as SaaS revenue.
Monetization Approaches
1
SaaS licenses for organizations
Annual subscription based on number of projects, users, or respondents managed. Many M&E tools follow this model. Key is pricing it within typical M&E budget percentages.
2
Usage-based fees
Like communications costs pass-through with markup, or per survey respondent fees. EngageSPARK, GeoPoll etc. use usage pricing. This aligns revenue with scale of the evaluation.
3
Marketplace commission
If the platform connects clients with field partners, it could take a commission on that contract. This is analogous to Upwork or other marketplaces and would monetize the coordination role.
4
Premium services
Offering expert support, customization, or data science consulting on top of the platform could be a revenue stream (especially in early years as product matures).
Profitability Examples
It's worth noting that pure "social sector tech" can be challenging for venture capital if growth is limited by donor budgets and fragmentation. However, some firms have navigated this:
Vera Solutions
Has grown via a hybrid consulting/software model and got recognition from Salesforce – suggesting profitability through services and strategic partnerships.
60 Decibels
Raised impact investment and thrives as a niche data provider.
Premise Data
Raised more traditional VC by also targeting commercial clients (market research etc.).
Market Size Opportunity
If a platform captured even 1% of global M&E budgets
That's on the order of $100M per year in revenue (1% of $6–21B is $60–210M). The key is proving you can unlock that spend by delivering value (nobody will pay for just fancy dashboards if they still have to hire lots of people – the platform must truly replace or streamline expensive steps).
For a new startup here, one approach is initial grant funding or impact investment to build the product (since it has clear social benefit), and then scaling revenue once it demonstrates cost savings. Major development donors have innovation funds (USAID, World Bank, Gates Foundation) that have previously funded digital platforms – this could be non-dilutive capital.
Concrete Demand Signals
Low-Cost RCT Competition
The Low-Cost RCT Competition by a US coalition (backed by foundations) explicitly looked for ways to do RCTs cheaper. This implies end customers (like government agencies) are eager for affordable evaluation methods.
Government "nudge units"
Governments setting up "nudge units" (BIT-style) indicates they want to run iterative experiments themselves. They will need tools to manage those experiments – better than just Excel.
Academic researcher adoption
Academic researchers in development economics, who often lead RCTs, have started to use more software. A platform that makes their research easier might get rapid adoption in the research community.
Adaptive learning focus
We also see donor consortia focusing on "adaptive learning" – essentially continuously evaluating and tweaking programs. They will be shopping for tools to enable that.
Gaps in Current Solutions
It's instructive to note the gaps, because gaps = opportunity:
M&E platforms focus on monitoring, not impact evaluation
Existing M&E platforms (DevResults, etc.) focus on monitoring (tracking indicators) rather than impact evaluation. They don't have built-in randomization or causal analysis. Users still have to jump out to Stata/R for analysis.
No deep integration of random assignment
None of the incumbents have deeply integrated the random assignment and statistical inference piece.
Manual effort still dominates
Traditional evaluations rely heavily on manual effort (for survey, for cleaning data). Newer tech-enabled approaches (remote surveys, automated analytics) are still not mainstream in big institutions.
Strategic Moats and Differentiators
If launching a company in this space, a few strategic elements could create a moat and make it a VC-backable, high-impact venture:
End-to-End Orchestration Moat
Currently, an organization might use 5+ different tools to execute an RCT (Excel for sampling, SurveyCTO for data, WhatsApp manually, etc.). This leads to lots of integration pain and errors.
If your startup offers one integrated platform where users can do everything (design → sample → randomize → deliver intervention → collect outcome data → see analysis), that "single pane of glass" is a huge selling point.
Imagine a dashboard where at any time the project manager can see enrollment numbers, data completeness, and preliminary impact results updating live – that's currently very hard to do without a lot of manual wrangling.
Data Network Effects
Over time, the platform could accumulate valuable datasets – e.g., directories of participants or communities that have been part of past studies, or benchmarks on cost per outcome across interventions.
If (with proper consent) the platform can reuse sampling frames or contact networks, it could, for example, help a new user quickly find a sample of farmers in Kenya because a previous project already engaged some. Or it could flag "typical attrition rate for similar projects is 20%" to help planning.
These kinds of data-driven features improve as more users/projects join, making the service better and better – a virtuous cycle that new competitors would struggle to replicate.
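A benchmark like "typical attrition for similar projects is 20%" feeds directly into enrollment planning; a one-line sketch of how the platform might apply it (hypothetical helper, stdlib only):

```python
import math

def enrollment_target(n_required, expected_attrition):
    """Inflate the required analysis sample so that, after the
    expected dropout rate, enough participants remain."""
    return math.ceil(n_required / (1 - expected_attrition))

# With a 20% attrition benchmark from past projects:
# enrollment_target(400, 0.20) -> enroll 500 to finish with ~400
```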
Local Partnerships at Scale
As discussed, partnering with IPA, J-PAL, and local survey firms creates a semi-network moat. If your platform becomes the go-to for, say, all IPA projects in a region, new entrants face the hurdle of prying away an entire network that is trained and integrated.
One could also formalize this into a certified partner program, where local firms get trained on the platform and perhaps earn referrals. The effect is to embed the startup into the ecosystem such that to use someone else would be inconvenient.
It's similar to how Salesforce built a moat with its vast consultant network – everyone is trained on it, so it's hard to switch.
Government Integration Moat
Getting your software officially integrated or endorsed in government data stacks (like a health ministry allowing your platform to plug into DHIS2 or a ministry of education using it for their evaluations) can lead to long-term contracts and high stickiness.
Governments move slowly but once they adopt a system, they keep it. If "RCT platform" features become part of how a ministry runs pilot programs, you could be looking at multi-year, nation-scale engagements.
There's also a trust element: being able to say "we comply with government data protection and we have a ready connector to your databases" smooths adoption in large bureaucracies.
AI/Automation Differentiation
If your platform truly leverages AI to cut out busywork (e.g., automating data cleaning, using NLP to analyze open feedback, or even auto-generating first drafts of reports), it will stand out in a sector where much is still manual Excel work.
This could allow a 10x improvement in productivity (for instance, an evaluator can manage 5 trials at once instead of one, because so many processes are automated). Such step-change improvements drive word-of-mouth.
Early success stories might be: "We normally spend 6 months and $300k to get results – with this platform we had reliable data in 6 weeks at a fraction of cost."
Challenges to Consider
It's important to be realistic – this field does have barriers:
Localization and Human Element
Running an impact study in rural India is very different from one in Kenya or Brazil. Language, culture, infrastructure vary. A platform must be flexible enough to handle offline modes, multi-language (including script direction issues), and allow human oversight where needed.
Conservative Stakeholders
Large donors and governments can be risk-averse. They might ask, "has this been used in a gold-standard study?" You may need some early flagship trials (perhaps in partnership with known entities like World Bank or J-PAL) to prove credibility.
Competition or DIY Solutions
There's always the risk that bigger actors (say, the World Bank or a big NGO federation) might try to build their own internal platform. To stay ahead, a startup must move faster and innovate (especially with AI).
Go-to-Market Considerations
Selling to NGOs and governments means longer sales cycles and a need to navigate procurement. One approach is to target the funders (who often can mandate tools for their grantees) – e.g., if a big foundation likes your platform, they might require all projects they fund to use it (this has happened with tools like DevResults in some cases).
Another approach is to have a free community edition that individual researchers and nonprofits start using (bottom-up adoption), then upsell institutional features to the organizations.
Key Ideas Missing from Analysis
Several important strategic considerations that weren't fully covered in the main analysis:
Reframe from "reporting" to real-time intelligence
M&E is largely treated as a compliance cost center; the opportunity is to turn it into a causal engine that predicts what will work, for whom, and at what cost (HTE, policy targeting, impact forecasting).
Non-discretionary budgets
Major donors mandate 3–10%+ for M&E; procurement is anchored in project/portfolio line items rather than seats. This enables value-based, project-aligned pricing.
The "missing middle" gap
Market bifurcates into heavy enterprise "systems of record" vs. point tools. There's a clear opening for an integrated, self-serve, end-to-end RCT workflow platform.
Value Chain Integration Opportunity
The costly friction isn't tools per se but stitching steps: design → sampling/consent → randomization → intervention delivery → data capture → data cleaning/integration across systems → analysis/reporting.
Strong signal of demand for native API connectors (e.g., SurveyCTO ingest, DHIS2 sync, RapidPro/Turn.io triggers). The "integration tax" is where costs hide and where a unified platform creates the most value.
AI-Native Feature Set
Why now: bottleneck has shifted to causal analysis. Data collection is commoditized; Causal ML (EconML/DoWhy) + LLMs make automated ATE/HTE, qualitative NLP, and report generation feasible. Moat = workflow, UX, integrations—not proprietary algorithms.
One-click HTE
Automated heterogeneous treatment effects analysis
NLP over qualitative data
Themes, sentiment, summaries from open-ended responses
Predictive impact forecasting
In-silico policy experiments and impact prediction
Auto-generated narratives
Automated report generation and dashboards
Product Blueprint: "Castor for Social Science"
The recommended path: built-in eConsent, IWRS-grade randomization (stratified/block), unified data layer (SurveyCTO + DHIS2), and an embedded causal engine.
Study Builder
Design evaluation with built-in best practices
Stratified Randomization
IWRS-grade assignment with proper concealment
Unified Data Layer
SurveyCTO + DHIS2 integration
Causal Engine
Automated analysis and reporting
Initial Target & MVP Scope
Target Research Ops teams (IPA, J-PAL, IDinsight, Busara, academic labs). MVP = the "setup & analysis loop": study builder → stratified randomization → 1-click SurveyCTO ingest → 1-click DML/HTE dashboard.
Go-to-market: Platform + Partner. Seed credibility and distribution by partnering with IPA/J-PAL (discounted pilots) and riding their networks; integration-first with SurveyCTO, RapidPro/Turn.io, and DHIS2 for administrative outcomes.
Pricing and Moats Strategy
Pricing Strategy
Price per active study / participant bands (project-aligned) with freemium for small academic projects to drive PLG; expand within orgs as studies scale.
Durable Moats
Workflow stickiness (platform becomes operational backbone), data network effects (benchmarks, attrition risk models, reusable frames), proprietary trained pipelines on accumulated social-program data, and deep government integrations.
Conclusion: Strong Opportunity
The research and data suggest a strong opportunity. There is a big problem (high cost and slow speed of impact evaluations) in a multi-billion dollar sector that is increasingly aware of the problem. Existing players solve pieces but not the whole, and technology (AI, mobile, cloud) now offers a path to solve it in a transformative way.
The timing is ripe with the "impact evaluation" zeitgeist and digital transformation in the social sector. A startup that "copy-pastes" what works from others and brings it into one coherent product could deliver an order-of-magnitude improvement.
If successful, such a platform not only has profit potential (capturing part of that $6-$21B/year spend), but also massive social impact: enabling hundreds more experiments to be run, so effective programs reach millions more people and ineffective ones are scaled down. This dual ROI (financial and social) could attract mission-aligned investors and talent, giving the startup an edge.
This analysis was prepared by Hugo Walrand. Its content is proprietary and may not be replicated, distributed, or used without his express written consent.