At work, we were discussing whether to build a collaborative knowledge base with Git and Jekyll. For this purpose we need entities with properties and relations among them.

I was wondering whether Jekyll is an appropriate tool, and how many pages it can handle while keeping the site's build time reasonable.

I wrote some code to randomly generate mock data for 700 people and 100 companies. Every person works for a company, and each company has zero, one or more employees. We need to show this information, plus some properties, on the company and people pages.
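The generator itself is not shown in this post. As a rough idea of what such a script could look like, here is a minimal Python sketch; the function name, ids and field names (`make_mock_data`, `company0`, `fullname`, and so on) are made up for illustration and are not the ones used in the real test site.

```python
import random


def make_mock_data(n_people=700, n_companies=100, seed=0):
    """Return (companies, people) dicts keyed by id.

    All ids and field names are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the mock data is reproducible

    companies = {
        f"company{i}": {
            "name": f"Company {i}",
            "founded": rng.randint(1950, 2015),
            "people": [],  # filled in below; may stay empty
        }
        for i in range(n_companies)
    }

    people = {}
    company_ids = list(companies)
    for j in range(n_people):
        person_id = f"person{j}"
        company_id = rng.choice(company_ids)  # every person works for one company
        people[person_id] = {"fullname": f"Person {j}", "company": company_id}
        companies[company_id]["people"].append(person_id)

    return companies, people


if __name__ == "__main__":
    companies, people = make_mock_data()
    print(len(companies), "companies,", len(people), "people")
```

Because each person picks a company at random, some companies can end up with no employees at all, which matches the "zero, one or more" case described above.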

First attempt

In my first attempt I created one page for every person and every company, and stored all the data in the Front Matter of each page. For example:

layout: company
id: able
title: Able
founded: 2000
country: Italy
permalink: /companies/able
people: [,, clown.grape]

With this data structure, the Liquid code that lists the employees of a company looks something like this:

{% for person in page.people %}
  {% for apage in site.pages %}
    {% if apage.id == person %}
     <li><a href="/people/{{ person }}">{{ apage.title }}</a></li>
    {% endif %}
  {% endfor %}
{% endfor %}

We loop over the people array of the page and, for each employee, loop over every page of the site looking for the one that belongs to that employee, so that we can print the person's name. With this non-optimized solution, the build took about 60 seconds. Launching Jekyll with jekyll serve --incremental helps a lot, but an incremental build only regenerates the page that changed, not the pages that reference it (e.g. updating a person's name changes the person's page but not the page of the company where that person works).
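To get a feeling for how much work the nested loops do, the same logic can be mimicked outside Jekyll. The Python sketch below is only an illustration: `render_by_scanning` is a hypothetical helper, pages are modelled as plain dicts with `id` and `title`, and it assumes the site contains nothing but the generated pages.

```python
def render_by_scanning(company_people, pages):
    """Mimic the nested Liquid loops and count the comparisons they make.

    company_people: list of person ids from a company's front matter
    pages: list of dicts with "id" and "title" (a stand-in for site.pages)
    Returns (links, comparisons).
    """
    links = []
    comparisons = 0
    for person in company_people:
        for apage in pages:  # a full pass over the site for every employee
            comparisons += 1
            if apage["id"] == person:
                links.append((person, apage["title"]))
    return links, comparisons


if __name__ == "__main__":
    # 800 stand-in pages: 700 people plus 100 companies
    pages = [{"id": f"page{i}", "title": f"Page {i}"} for i in range(800)]
    links, comparisons = render_by_scanning(["page3", "page42"], pages)
    print(links, comparisons)
```

Under those assumptions, each of the 700 people is listed by one company and triggers a pass over all 800 pages, so a full build performs roughly 700 × 800 = 560,000 comparisons inside the templates. That is probably only one of the factors behind the 60 seconds, but it shows how quickly the approach grows with the number of pages.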

Second attempt

I initially thought that all Front Matter data was kept in memory during the build, but given these results I no longer think so. I needed a faster approach, so I decided to keep a single yml file storing data that any page can access by key, thus reducing the number of documents the build process needs to read.

In the front matter of each page I now store just the id and the required parameters:

layout: company
id: ability
title: Ability
permalink: /companies/ability.html

I am also keeping a companies.yml file containing:

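# top-level keys are the company ids used in the pages' front matter
# (the key of the second entry is assumed)
ability:
  name: Ability
  founded: 2004
  country: Spain
  people: [jose.navarro, ricardo.fereira]
megacorp:
  name: Mega corp
  founded: 2001
  country: USA
  people: [john.brown, alice.steven, bob.mckenzie]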
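With 100 companies, a file like this is easier to generate than to write by hand. The following Python sketch shows one way to serialize a dict of companies into a layout keyed by company id; `companies_to_yaml` is a hypothetical helper, and it assumes that names, countries and ids contain no characters that would need quoting in YAML.

```python
def companies_to_yaml(companies):
    """Serialize {company_id: {...}} into a YAML mapping keyed by company id.

    Assumes values are plain words or numbers that need no YAML quoting.
    """
    lines = []
    for company_id, company in companies.items():
        lines.append(f"{company_id}:")
        lines.append(f"  name: {company['name']}")
        lines.append(f"  founded: {company['founded']}")
        lines.append(f"  country: {company['country']}")
        # flow-style list of person ids, e.g. [jose.navarro, ricardo.fereira]
        lines.append(f"  people: [{', '.join(company['people'])}]")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = {
        "ability": {
            "name": "Ability",
            "founded": 2004,
            "country": "Spain",
            "people": ["jose.navarro", "ricardo.fereira"],
        }
    }
    print(companies_to_yaml(demo), end="")
```

Jekyll loads data files from the site's `_data` directory, so the output would be saved there as companies.yml. For real data it would be safer to use a proper YAML library instead of string formatting.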


Now the Liquid code that lists the employees of a company becomes:

{% assign company = site.data.companies[page.id] %}
{% for personid in company.people %}
  {% assign person = site.data.people[personid] %}
  <li><a href="{{ site.baseurl }}/people/{{ personid }}.html">{{ person.fullname }}</a></li>
{% endfor %}

With these changes, a complete regeneration of the site takes just 3 seconds on a MacBook Pro.


You can have a look at the test web site and its open source code.

