Universal Dependencies is a continuously growing multilingual collection of treebanks containing annotated language data. I use this dataset to analyze different language parameters.

This project performs a thorough per-language statistical analysis of token and sentence lengths for 91 languages present in version 2.6 of the dataset.
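
As a rough illustration of what such a measurement involves, the sketch below computes sentence lengths (in tokens) and token lengths (in characters) from text in the CoNLL-U format that Universal Dependencies treebanks use. It is not the project's actual code: the function name `length_stats`, the sample data, and the choice to skip multiword-token ranges and empty nodes are assumptions made for this example.

```python
from statistics import mean


def length_stats(conllu_text):
    """Return (sentence_lengths, token_lengths) for a CoNLL-U string.

    Hypothetical helper: sentence length is counted in tokens and
    token length in characters of the FORM column.
    """
    sentence_lengths = []
    token_lengths = []
    current = 0  # tokens seen so far in the current sentence
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentence_lengths.append(current)
                current = 0
            continue
        if line.startswith("#"):
            continue  # sentence-level comment / metadata
        fields = line.split("\t")
        token_id, form = fields[0], fields[1]
        # Assumption: skip multiword-token ranges ("1-2") and empty nodes ("1.1").
        if "-" in token_id or "." in token_id:
            continue
        token_lengths.append(len(form))
        current += 1
    if current:
        sentence_lengths.append(current)  # file did not end with a blank line
    return sentence_lengths, token_lengths


# Made-up two-sentence sample in CoNLL-U layout (10 tab-separated columns).
SAMPLE = (
    "# text = The dog barks.\n"
    "1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
    "# text = Hi!\n"
    "1\tHi\thi\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\t!\t!\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
)

sentence_lengths, token_lengths = length_stats(SAMPLE)
print("mean sentence length:", mean(sentence_lengths))
print("mean token length:", mean(token_lengths))
```

Running the same function over every treebank file of a language and pooling the two lists would give the per-language distributions; whether punctuation or multiword tokens should be counted is a design choice that changes the resulting figures.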

Visit the project page for a more complete description.