
Statistical analysis on the Italian firms


honestus/AIDA


This project is a statistical analysis of Italian firms. It was created for educational purposes as part of the Statistical Methods for Data Science course (A.Y. 2017-2018) at the Università di Pisa.

We analyzed Italian firms by trying to answer some common questions in the economic/statistical field:

  • which measure of size best describes firm size;
  • how strongly the different measures of firm size are correlated;
  • how firm sizes are distributed;
  • how firm growth is distributed;
  • whether the mean growth is statistically different from zero;
  • whether the growth distribution is symmetric or asymmetric.

We also tried to distinguish the behaviours of distinct subsamples of the whole dataset, such as distinct subsectors, distinct years, and distinct firm sizes.
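The two growth questions listed above (mean different from zero, symmetry) can be sketched in base R as follows. This is only a minimal illustration on simulated data, not the code used in the repository: the variable names, the log-difference definition of growth, and the choice of a one-sample t-test and of sample skewness are all assumptions made for the example.

```r
# Hypothetical data: revenue of 500 simulated firms in two consecutive years.
set.seed(42)
n <- 500
revenue_prev <- rlnorm(n, meanlog = 7, sdlog = 1.5)
revenue_curr <- revenue_prev * exp(rnorm(n, mean = 0.02, sd = 0.3))

# Growth rate taken as the difference of log sizes (an assumption of this sketch).
growth <- log(revenue_curr) - log(revenue_prev)

# One-sample t-test of H0: mean growth = 0.
mean_test <- t.test(growth, mu = 0)

# Sample skewness (third standardized moment); values near 0 suggest symmetry.
skewness <- function(x) {
  z <- x - mean(x)
  mean(z^3) / mean(z^2)^1.5
}
growth_skew <- skewness(growth)

print(mean_test$p.value)
print(growth_skew)
```

The same steps could be repeated on each subsample (for example, by year or by subsector) by splitting the data first.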

Tools and Technologies used

All the analyses were done with R (version 3.5.0).

To manipulate our data we used the dplyr package; for the power law distribution we used the poweRlaw library. For plotting we mostly used the ggplot package. Any other required package is listed in the packages.txt file.
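To give an idea of what fitting a power law involves, here is a minimal base-R sketch of the maximum-likelihood estimate of the exponent for a continuous power law with a known lower cutoff. It is a simplified stand-in, not the poweRlaw workflow used in the repository: the data are simulated, the cutoff `xmin` is assumed known, and `fit_power_law` is a hypothetical helper written for this example. Real employee counts are discrete, so a discrete model may be more appropriate.

```r
set.seed(1)
alpha_true <- 2.5   # assumed exponent used only to simulate data
xmin <- 10          # assumed lower cutoff of the power-law tail
n <- 10000

# Inverse-transform sampling from p(x) proportional to x^(-alpha), x >= xmin.
u <- runif(n)
employees <- xmin * (1 - u)^(-1 / (alpha_true - 1))

# Maximum-likelihood estimate of the exponent for a known xmin
# (hypothetical helper, continuous case only).
fit_power_law <- function(x, xmin) {
  x_tail <- x[x >= xmin]
  n_tail <- length(x_tail)
  alpha_hat <- 1 + n_tail / sum(log(x_tail / xmin))
  list(alpha = alpha_hat,
       se = (alpha_hat - 1) / sqrt(n_tail),  # asymptotic standard error
       n = n_tail)
}

fit <- fit_power_law(employees, xmin)
print(fit$alpha)
print(fit$se)
```

In practice the cutoff is usually unknown and has to be estimated as well, which is one reason to rely on a dedicated package rather than this closed-form estimate.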

Files description

A brief description of the distinct directories and files you may find in this repository:

  • the data directory contains the RData files with our original data;
  • the files directory contains:
    • distrResults, which contains all the RData files with the results of the distributions fitted on distinct (sub)samples;
    • images, which contains all the images of plots, CIs, etc.;
  • utils.R is an R file with very general utilities (e.g. loading the required packages, loading the datasets into the current workspace);
  • functions.R is an R script that contains several useful functions for the analysis;
  • first_analysis.R contains a very general analysis of the whole dataset, e.g. basic statistics of the distinct features;
  • correlation.R contains the correlation analysis and linear regression for the Employee and Revenue attributes;
  • test_distr.R contains all the analysis done on the size distribution of the firms;
  • powerlaw.R further analyzes the power law hypothesis on firm size using the Employee attribute;
  • growth_rate_dist.R and all the remaining files whose names start with "growth" (one for each (sub)sample) contain the analysis of the growth of Italian firms;
  • distributionResultsAnalysis contains our analysis of the results stored in "files/distrResults";
  • packages.txt contains the list of packages needed to perform the analysis.
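As an illustration of the kind of analysis described for correlation.R, the sketch below computes correlations and a log-log linear regression between two size measures in base R. It is not taken from the repository: the data are simulated, the `firms` data frame is hypothetical, and only the column names Employee and Revenue follow the attributes mentioned above.

```r
# Hypothetical data: 1000 simulated firms with two size measures.
set.seed(7)
n <- 1000
employee <- round(rlnorm(n, meanlog = 3, sdlog = 1)) + 1
revenue <- 50 * employee^0.9 * exp(rnorm(n, sd = 0.5))
firms <- data.frame(Employee = employee, Revenue = revenue)

# Pearson correlation on the log scale and rank-based Spearman correlation.
pearson_log <- cor(log(firms$Employee), log(firms$Revenue))
spearman <- cor(firms$Employee, firms$Revenue, method = "spearman")

# Log-log linear regression: the slope is the elasticity of revenue
# with respect to the number of employees.
model <- lm(log(Revenue) ~ log(Employee), data = firms)
slope <- unname(coef(model)[2])

print(c(pearson = pearson_log, spearman = spearman, slope = slope))
```

Working on the log scale is a common choice for heavily skewed size variables, since a few very large firms would otherwise dominate the Pearson correlation.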

For a deeper and clearer explanation of the procedures and the results, please read our final report.

