triztian/caselawcite

A project to explore the citations and influence of attorneys across state rulings.


An analysis project that inspects citations in rulings from Illinois.

Analysis

As mentioned in the project report, we seek to answer the following questions:

  1. Which attorney has participated in the most cases? Which among attorneys for private parties? Which among attorneys for the government?
  2. How often is the work in which an attorney is involved cited, i.e., how influential is that work?
  3. What is the page count of cases in which an attorney has participated?

These questions are answered in the following Jupyter notebooks, and the findings are presented in the project report (a rough sketch of the kind of query behind question 1 appears after the list):

  1. Q1 Most Influential Attorneys
  2. Q2 Attorney Case Citations
  3. Q3 Average page count for cases
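As a rough illustration of the kind of aggregation behind question 1, the Python sketch below counts case participations per attorney against the hcap.sqlite database built later in this README. The table and column names (attorneys, case_attorneys, attorney_id) are hypothetical placeholders, not the project's actual schema, which is defined by the DDL files in ./Database/.

import sqlite3

# Hypothetical schema for illustration only; the real table and column
# names come from the DDL files in ./Database/.
QUERY = """
SELECT a.name, COUNT(*) AS case_count
FROM attorneys AS a
JOIN case_attorneys AS ca ON ca.attorney_id = a.id
GROUP BY a.id
ORDER BY case_count DESC
LIMIT 10;
"""

conn = sqlite3.connect("hcap.sqlite")
for name, case_count in conn.execute(QUERY):
    print(name, case_count)
conn.close()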

Obtaining the Data

Bulk case data

Bulk case data can be downloaded with the following commands:

mkdir Data && cd Data
curl https://api.case.law/v1/bulk/22341/download/

Case citations

The citations file can be downloaded with the following commands:

cd Data/Illinois-20200302-text/data
curl https://case.law/download/citation_graph/2020-04-28/citations.csv.gz
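The compressed citations file can be inspected without fully decompressing it. The short Python sketch below is an illustrative check, not part of the project's scripts; it assumes the file sits in the current directory and prints the header row and the first data row:

import csv
import gzip

# Assumes citations.csv.gz was downloaded into the current directory.
with gzip.open("citations.csv.gz", "rt", newline="") as fh:
    reader = csv.reader(fh)
    print(next(reader))  # column names as shipped in the export
    print(next(reader))  # first citation record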

Preparing the Data for analysis

First, be sure to install the required Tools as listed here:

After downloading the data into the Data directory, we can use the Python script ./ETL/hcapetl.py to transform, clean, and insert the data into a SQLite database that will simplify our analysis.

The data must be extracted first with these commands:

DATA=Data/Illinois-20200302-text/data
DPROC=Data/Processed
mkdir -p "$DPROC"
xzcat "$DATA/data.jsonl.xz" > "$DATA/data.jsonl"
jq -s '.' "$DATA/data.jsonl" > "$DPROC/data.json"
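For reference, the commands above simply decompress the JSON-lines export and slurp it into a single JSON array. A rough standard-library Python equivalent, using the same paths as above, would be:

import json
import lzma
from pathlib import Path

DATA = Path("Data/Illinois-20200302-text/data")
DPROC = Path("Data/Processed")
DPROC.mkdir(parents=True, exist_ok=True)

# Parse one JSON object per line from the compressed JSON-lines file.
with lzma.open(DATA / "data.jsonl.xz", "rt", encoding="utf-8") as fh:
    records = [json.loads(line) for line in fh if line.strip()]

# Write the records back out as a single JSON array, mirroring `jq -s '.'`.
with open(DPROC / "data.json", "w", encoding="utf-8") as out:
    json.dump(records, out)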

The database will be named hcap.sqlite and can be created with the following commands:

dbpath=./hcap.sqlite
./ETL/hcapetl.py create tables "$dbpath" ./Database/*.ddl.sql
./ETL/hcapetl.py create attorneys "$dbpath" ./Data/Processed/data.json
./ETL/hcapetl.py create cases "$dbpath" ./Data/Processed/data.json
./ETL/hcapetl.py create citations "$dbpath" ./Data/Processed/data.json

Running the full ETL pipeline should take about 10 minutes (excluding data download).

The previous commands can be found in the gendata.sh script at the root of this project.
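Once the database exists, a quick sanity check (a minimal sketch, independent of the project's scripts) is to list its tables and row counts:

import sqlite3

conn = sqlite3.connect("hcap.sqlite")

# sqlite_master lists every table created by the DDL scripts.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

for table in tables:
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    print(table, count)

conn.close()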

The ETL directory has all of the Python source necessary to work with the data. To aid with the exploration and cleanup, we have the following Jupyter notebooks:

  1. Attorney Name Parsing
  2. Attorney Record Parsing
  3. Attorney Career Overview
  4. Influential Case Attorneys

The Data Exploration file has information about the commands used to gain insights into fragments of the data and to determine a SQL database schema.
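The kind of quick inspection described there usually amounts to peeking at a record's fields. For example, a sketch against the processed file produced above (the exact field names depend on the export):

import json

# Load the slurped JSON array produced during extraction.
with open("Data/Processed/data.json", encoding="utf-8") as fh:
    cases = json.load(fh)

print(len(cases), "case records")
print(sorted(cases[0].keys()))  # top-level fields of a single case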
