Navigate to the “TOC Jobs” page after logging in as a user
Click “Add New Job”
Click “Choose File” and upload the relevant file
At the moment, only *.CSV files are accepted
The first row of the file must contain the column/variable name(s)
The main input variable should not be labelled as “desc” since this is a temporary variable name created during TOC preprocessing
File size cannot exceed 100 MB
Support for additional input file types will be implemented in the future
After uploading the file, click “Add Job to Processing Queue”
On the next page, wait for file validation to finish, then use the “Select Column with Data to Process:” dropdown menu to choose the variable that contains the charge descriptions
Click “Set Data Column” to add the job to the processing queue, then click “Review Jobs” to go back to the “TOC Jobs” page
Once the job status changes from “Not Completed” to “Completed” on the “TOC Jobs” page, an email notification with a link to the “TOC Jobs” page will be sent to the registered email address. On the “TOC Jobs” page, click the “Results” button to download the file
Classification results will be available for 30 calendar days after the date of submission. After 30 days, the classification results will no longer be available in your user account.
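The file requirements listed in the upload step can be checked locally before submitting a job. The following is a minimal sketch, not part of TOC itself: the function name check_toc_upload is hypothetical, "100 MB" is assumed to mean 100 * 1024 * 1024 bytes, the file is assumed to be UTF-8 encoded, and the “desc” check is done case-insensitively as a cautious assumption.

```python
import csv
import os

MAX_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 * 1024 * 1024 bytes
RESERVED_NAME = "desc"         # temporary variable name used during TOC preprocessing


def check_toc_upload(path):
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    if not path.lower().endswith(".csv"):
        problems.append("file extension is not .csv")
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 100 MB")
    # "utf-8-sig" also reads plain UTF-8 and drops a leading byte-order mark if present
    with open(path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    names = [name.strip() for name in header]
    if not names or any(name == "" for name in names):
        problems.append("first row has a missing or empty column name")
    # assumption: compare case-insensitively to be on the safe side
    if RESERVED_NAME in (name.lower() for name in names):
        problems.append('a column is named "desc"')
    return problems
```

Calling check_toc_upload("charges.csv") on a file whose first row is "Description" would return an empty list. The sketch cannot tell whether the first row really holds column names or is already data, so that still needs a manual look.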
Input Data Specification
TOC is a text classification tool, so the offense descriptions used as inputs need to contain at least some English vocabulary
Examples of acceptable inputs:
“Capital murder”
“35 42 1 1 2 murder”
“poss of methamph”
“Operating a motor vehicle expired regis less than 6 mos”
“DUI w/ one prior and breath alcohol of 15 or greater”
Types of data to omit from the input:
Case/Citation/Ticket/Warrant number
Statute number
A statute number can be included if the entry also contains the description of the statute (e.g. “2C:35-10 CDS/Possession” rather than “2C:35-10” alone). However, the result may not be as reliable.
Non-criminal offenses
TOC is primarily intended for classifying criminal offenses. As such, predicted charge codes for descriptions related to immigration law or civil cases will not be reliable, and such descriptions should be omitted from user submissions.
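One way to spot entries that consist only of case numbers or statute numbers is a simple vocabulary check before upload. The sketch below is an illustrative heuristic and not TOC's own validation: the rule "a run of three or more letters counts as a word" and the names has_vocabulary and split_inputs are assumptions made for this example. It does not detect non-criminal offenses, which still need to be removed by hand.

```python
import re

# Hypothetical rule: a run of 3 or more letters counts as an English-like word.
WORD = re.compile(r"[A-Za-z]{3,}")


def has_vocabulary(description):
    """True if the description contains at least one word-like run of letters."""
    return WORD.search(description) is not None


def split_inputs(descriptions):
    """Split descriptions into (keep, review) lists based on has_vocabulary."""
    keep, review = [], []
    for text in descriptions:
        (keep if has_vocabulary(text) else review).append(text)
    return keep, review


keep, review = split_inputs([
    "Capital murder",
    "2C:35-10",
    "2C:35-10 CDS/Possession",
    "CF00001",
    "35 42 1 1 2 murder",
])
print(keep)    # ['Capital murder', '2C:35-10 CDS/Possession', '35 42 1 1 2 murder']
print(review)  # ['2C:35-10', 'CF00001']
```

The three-letter threshold is arbitrary. It keeps short abbreviations such as “DUI” while setting aside bare identifiers such as “CF00001”, but it may need adjusting for a particular dataset.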
Sample Data
Valid
Description
Capital murdr
Operate mv expired regis
2C:35-10 CDS/Possession
public order offense
Single-column data with column name in the first row. Subsequent rows contain text descriptions
Statute      | Description
35 42 1 1 2  | Murder
2C:35-10     | CDS/Possession
35 43 10 3 3 | legend drug deception
162205       | bail jump i
Multi-column data with column names in the first row. The column that contains the text description of the offense should be selected as the main input variable
Invalid
?
Capital murdr
Operate mv expired regis
2C:35-10 CDS/Possession
public order offense
Single-column data without a column name in the first row can cause errors
Statute      | Case_Number
35 42 1 1 2  | CF00001
2C:35-10     | CF00002
35 43 10 3 3 | CF00003
162205       | CF00004
Multi-column data without text descriptions. Although TOC can process such data, the results are unreliable because the input lacks English vocabulary
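For multi-column files, it can help to check which column actually holds text descriptions before choosing the main input variable. The following sketch uses an assumed heuristic (a run of three or more letters counts as a word) and a hypothetical function name, text_share_by_column. It reports, for each column, the fraction of non-empty values that contain a word.

```python
import csv
import io
import re

# Hypothetical rule: a run of 3 or more letters counts as an English-like word.
WORD = re.compile(r"[A-Za-z]{3,}")


def text_share_by_column(csv_text):
    """Map each column name to the fraction of its non-empty values containing a word."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = {name: 0 for name in reader.fieldnames or []}
    totals = dict(hits)
    for row in reader:
        for name in hits:
            value = (row.get(name) or "").strip()
            if value:
                totals[name] += 1
                hits[name] += bool(WORD.search(value))
    return {name: hits[name] / totals[name] if totals[name] else 0.0 for name in hits}


valid = (
    "Statute,Description\n"
    "35 42 1 1 2,Murder\n"
    "2C:35-10,CDS/Possession\n"
    "35 43 10 3 3,legend drug deception\n"
    "162205,bail jump i\n"
)
invalid = (
    "Statute,Case_Number\n"
    "35 42 1 1 2,CF00001\n"
    "2C:35-10,CF00002\n"
)
print(text_share_by_column(valid))    # {'Statute': 0.0, 'Description': 1.0}
print(text_share_by_column(invalid))  # {'Statute': 0.0, 'Case_Number': 0.0}
```

A column with a share close to 1.0 is a reasonable candidate for the main input variable. If every column scores near 0.0, as in the invalid sample, the file probably lacks text descriptions.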