25 Data Catalog/Repository

São Paulo already has a centralized data catalog called Governo Aberto SP (SP Open Government). It represents the efforts of the state's Public Administration to gather into one location information about its public databases, their custodians and features, how to download them from the Web, and their formats. A central data repository is recommended so that citizens do not have to spend hours "gathering" databases from the sites of different state agencies. Successful initiatives around the world show that grouping databases in a central catalog is not only more convenient for citizens, but is also a smart way to measure and monitor the health of the public databases available to society.

SP Open Government can stop being a final product that serves as a query tool and start being a platform. We should think about what types of information are entered and what kinds of products may come from this portal, so that behind-the-scenes integration (e.g., designing scripts to automate the periodic publication of databases) becomes part of civil servants' workflow without disrupting their routine. The portal could also be a means for the government to track which open data is being published and to keep it organized and up to date.
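As an illustration of the kind of script mentioned above, here is a minimal sketch in Python that flags which databases are due for republication. The record fields (`name`, `custodian`, `frequency`, `last_published`), the frequency labels, and the age limits are all hypothetical and chosen for the example; they do not describe how the SP Open Government portal actually stores its metadata.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical update frequencies and the maximum age (in days)
# a database may reach before it counts as overdue.
MAX_AGE_DAYS = {"daily": 1, "weekly": 7, "monthly": 31, "yearly": 366}


@dataclass
class Dataset:
    """Illustrative catalog record; the field names are assumptions."""
    name: str
    custodian: str
    frequency: str        # one of the keys in MAX_AGE_DAYS
    last_published: date


def is_overdue(dataset: Dataset, today: date) -> bool:
    """Return True if the dataset is older than its declared frequency allows."""
    max_age = timedelta(days=MAX_AGE_DAYS[dataset.frequency])
    return today - dataset.last_published > max_age


def overdue_datasets(catalog: list[Dataset], today: date) -> list[str]:
    """List the names of all datasets that should be republished."""
    return [ds.name for ds in catalog if is_overdue(ds, today)]
```

A scheduled job could run `overdue_datasets` once a day and notify each custodian, so that the check becomes part of the routine rather than an extra manual task.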

Among the products that can rely on a well-managed centralized data catalog are dashboards, or dynamic panels. These tools could show the state's performance, using several indicators built from the databases available on the portal. This would be an interesting strategy, since the data used must be open and up-to-date for the dynamic panel to work, creating a virtuous cycle.

The panels can be created according to executive, department, and citizen demands in areas such as crime, finance, health, environment, and transportation. The choice can be made by brainstorming with public employees, managers, and civil society, or through a public survey. The results would form the brief for development, which could be carried out through competitions: the government would make the databases and APIs available to programmers and companies, who would develop the prototypes. This relationship between the government and the private sector could stimulate the creation of new businesses using open data.

There are several tools on the market that can help the Public Administration implement a centralized data catalog. Currently, two that stand out are Socrata, developed by an American company, and CKAN, a free and open-source tool maintained by Open Knowledge and a community of developers. Both tools are used in government open data portals around the world, and each has advantages and disadvantages. It is up to the public manager to examine their technical characteristics and choose the one that best meets the needs and context of São Paulo.
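To give a sense of what examining the technical characteristics can involve, the sketch below queries a CKAN portal through its Action API (`package_search`) and counts the file formats of the published resources, one simple way to monitor the health of a catalog. It assumes a CKAN-based portal; the portal URL in the usage note is a placeholder, and the helper names (`package_search_url`, `fetch_datasets`, `format_counts`) are invented for this example.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen


def package_search_url(portal_url: str, query: str = "", rows: int = 100) -> str:
    """Build the URL of a CKAN `package_search` call for the given portal."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_url.rstrip('/')}/api/3/action/package_search?{params}"


def fetch_datasets(portal_url: str, query: str = "", rows: int = 100) -> list:
    """Fetch dataset records from a CKAN portal (requires network access)."""
    with urlopen(package_search_url(portal_url, query, rows)) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError("CKAN API call did not succeed")
    return payload["result"]["results"]


def format_counts(datasets: list) -> Counter:
    """Count resources by declared file format across all datasets."""
    counts = Counter()
    for dataset in datasets:
        for resource in dataset.get("resources", []):
            # Resources with no declared format are grouped as UNKNOWN.
            counts[(resource.get("format") or "unknown").upper()] += 1
    return counts
```

For example, `format_counts(fetch_datasets("https://catalog.example.org"))` would show how many resources are offered as CSV, JSON, PDF, and so on, which helps reveal how much of the catalog is published in machine-readable formats. A Socrata-based portal exposes a different API, so this sketch would not apply to it unchanged.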