FIND THE LABELS THAT MATTER

Label your data quickly and clearly with Quickcode, a platform built to help you
create valuable training data from your most sensitive text.

WHY QUICKCODE

Quickly and clearly label the training data you need to teach and tune your machine-learning models. Quickcode is a label recommender built with patent-pending methodologies that allow your experts to quickly prepare training data for machine learning and other modeling from your text. We’ve helped some of the largest corporations, federal agencies, and universities. How can we help you?

Control your data.

Quickcode is designed to work on your cloud with your data under your rules, allowing you to safeguard sensitive, proprietary, or regulated information.

Leverage your experts.

Your experts’ time is valuable. You don’t want to take them offline for a week to label data. Paired with Quickcode, experts can label thousands of documents in less than an hour.

Iterate quickly.

Decision makers need models built on today’s data, not yesterday’s. Quickcode gives your data science team a way to quickly get fresh training data to tune their models. And they won’t need to tie up your subject matter experts each time they need to make an update.

Keep it simple.

Our platform is powered by sophisticated methods, built on years of research conducted by our team of data scientists and engineers. We offer an intuitive experience so you can focus on applying your expertise, fast.

QUICKCODE FEATURES

  • We integrate with your existing workflow.

  • We make it easy to get your text in and your labels out.

  • We work on short and long documents and everything in between.

  • We can handle humor, rhetoric, slang, code, and any terms unique to your industry.

  • We can recommend labels in any language, with expertise in Spanish, Chinese, Arabic, and Russian.

  • We can host on our cloud or install on yours.

WHO WE HELP

Challenge

How do you quickly turn texts into labels for your machine learning?

Your manager and clients want you to use machine learning to predict an outcome. You know how to tackle the structured data you have, but what do you do with the column of text data (e.g., customer comments, technician notes)? What if the outcome itself needs to be coded up based on the text? You need to quickly and accurately code documents into categories, but you don’t have the time and resources to read and code thousands of documents.

Solution

Thresher will help you quickly code documents by suggesting keywords for building accurate and precise queries that classify them.

Example

A data scientist used Thresher ‘Quick Code’ mode to create a query in less than 15 minutes that classified more than 5,500 SMS messages as spam or not spam. This classifier had 95% accuracy relative to human coders. Furthermore, the messages classified by the Thresher-built query were used to train a machine learning model to predict future spam texts. The model had similar performance to a model trained with human-coded labels. The ‘Quick Code’ approach was accurate and fast. In addition, the data scientist found it valuable because:

1) They could use the query to easily explain to their managers what words and numbers were commonly found in spam text.
2) When spammers changed their patterns, the data scientist iterated again with Thresher’s ‘Quick Code’ to update the query and prediction models, which helped the team stay abreast of the changes in spam tactics.
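
To make the idea of a keyword query concrete, here is a minimal Python sketch. It is not Thresher's implementation: the keywords in SPAM_QUERY, the function names, and the four sample messages are all hypothetical, chosen only to show how a query can label messages and how its agreement with human coders can be measured.

```python
import re

# Hypothetical query: a message is labeled spam if it contains any of these words.
SPAM_QUERY = {"free", "win", "prize", "claim", "txt"}

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches_query(text, query=SPAM_QUERY):
    """Return True if the message shares at least one token with the query."""
    return bool(tokenize(text) & query)

def accuracy(messages, human_labels, query=SPAM_QUERY):
    """Share of messages where the query's label agrees with the human label."""
    predictions = [matches_query(m, query) for m in messages]
    correct = sum(p == h for p, h in zip(predictions, human_labels))
    return correct / len(messages)

# Made-up sample data, for illustration only.
messages = [
    "WIN a free prize now, txt CLAIM to 80082",
    "Are we still on for lunch tomorrow?",
    "Free entry in our weekly draw",
    "Can you call me when you land?",
]
human_labels = [True, False, True, False]

print(accuracy(messages, human_labels))
```

Because the query is just a list of words, it doubles as the explanation of the label: anyone can read it and see why a message was flagged, and updating it when spam patterns change means editing that list.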

Challenge

How do you quickly search and label vast quantities of data for new insights into healthcare, while at the same time explaining your data labeling decisions to others?

You’ve identified a factor that could dramatically affect your patients’ well-being, and have thousands of records that might help test your hypothesis.

But there are too many documents for you to review and label on your own. Even if you could, it would take time to explain your categorization system to others or to change it on the fly if new hypotheses emerge.

Solution

Quickcode helps you speedily identify healthcare documents relevant to your interests and explain your data labeling approach to your peers.

Example

A data scientist used Quickcode to rapidly identify a subset of medical patients and transparently explain their labeling process to others. The data scientist began with a hypothesis that familial or social support improved patient outcomes. They then searched 50,000 discharge summaries made by healthcare providers with a single-word query: “social.” Using Quickcode, the scientist then selected 66 recommended labels, leading to the discovery of 33,210 relevant documents in less than 15 minutes. Quickcode subsequently allowed the data scientist to transparently discuss the validity of their data labeling decisions with subject matter experts and supervisors and select more precise labels based on their input.

Challenge

How do you train and refine a machine learning model to identify complaints from your customers?

You want to use a machine learning application to correctly identify and route messages from your customers to the relevant departments. But how do you teach your model to sort customer messages appropriately? Even if you are able to build a classification system, how can you easily explain your labeling criteria to supervisors and others?

Solution

Use Quickcode to create labeled training data that can be used to train machine learning models while also providing the transparency needed to discuss and refine your work with others.

Example

A user wanted to train a model to recognize complaints of cyber theft. The user started with a single-word query, “hack,” which they used to search 160,000 customer messages collected from over 3,000 financial services. Using Quickcode, they then iterated through the recommended labels and, in less than 10 minutes, expanded their training data set to more than 50 times as many complaints. The expanded data set also covered more than 10 times as many affected financial institutions as the original set, providing a robust selection of training data with which to build predictive models. The labels also provided transparency, allowing the user to discuss their data labeling decisions with supervisors and adjust based on their input.
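
The workflow in the example above (start from one seed word, review recommended terms, and grow the set of matching documents) can be sketched in Python as follows. This is an illustrative assumption, not Quickcode's patent-pending method: the sketch simply recommends the words that most often co-occur with the current query, and the stopword list, function names, and sample messages are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "are", "made", "my", "on", "there", "was"}

def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def matching_docs(docs, query):
    """Return the documents that contain at least one query term."""
    return [d for d in docs if tokenize(d) & query]

def recommend_terms(docs, query, top_k=3):
    """Rank terms by how many query-matching documents they appear in."""
    counts = Counter()
    for doc in matching_docs(docs, query):
        counts.update(tokenize(doc) - query - STOPWORDS)
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]

# Made-up customer messages, for illustration only.
docs = [
    "Hack on my account, unauthorized transfer made",
    "There was a hack and an unauthorized withdrawal",
    "Unauthorized charge appeared on my card",
    "Branch hours are inconvenient",
]

query = {"hack"}
suggestions = recommend_terms(docs, query, top_k=1)
expanded = query | set(suggestions)  # the expert accepts the suggestion

print(suggestions)
print(len(matching_docs(docs, query)), len(matching_docs(docs, expanded)))
```

In this sketch the expert stays in the loop: a suggested term only enters the query once it is accepted, so the final list of terms documents exactly how the training set was assembled.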

Challenge

Are agencies making mission-critical decisions with incomplete data?

Government data scientists, analysts, lawyers, and researchers share many of the same challenges as their private sector counterparts. But some of their needs are different. Their analysis, models, and predictions shape policies that affect citizens’ lives and inform national security. When the stakes are this high, you need to have the most complete data set possible.

Solution

Thresher helps your agency’s data experts quickly and transparently curate data sets for analysis, modeling, and prediction. And they can use Quickcode with their data on their cloud. Better words mean better data, and better data mean better predictions.

Examples of Thresher’s Support of Agencies’ Missions

1) Finding codewords to better understand sensitive online conversations
2) Categorizing writings about suicide bombings and domestic violence
3) Labeling foreign-language texts by dialect for better sentiment analysis
4) Creating labels from the slang used to talk about drugs and human trafficking online

Security First

Thresher was built from the ground up with security in mind. We work in the most sensitive environments across intelligence, defense, and civilian government agencies.

  • Install Quickcode on-premises behind your firewall.
  • Leverage Quickcode in the cloud through Amazon GovCloud.
  • Comply with FISMA and NIST standard protocols.

We are proud recipients of contracts from the DARPA-sponsored Small Business Innovation Research (SBIR) program. Their support is an important part of our broader commitment to continuous innovation and rigorous testing of our core technologies. 

Working With Government Agencies

Thresher is a U.S. Small Business Administration (SBA)-certified small business with a robust federal partnering ecosystem. Contact us today to get started with a proof of concept or pilot program.

We believe that combining what computers do best with what experts do best creates sharp insight.
