Suggestions for Theses & Projects

The assignment of each topic to a type of work (individual project, bachelor thesis, or diploma thesis) is only approximate and can change once the topic is specified in more detail. The type can also be adjusted on request (bachelor thesis <-> individual project, etc.). If any of the topics interests you, contact one of our members or Ph.D. students.

Linked Data and Graph Databases

RDF is a format for publishing data on the future web in a machine-readable form. As part of the OpenData initiative, many sources with unclear semantics, such as HTML pages, are converted to their machine-readable equivalent in RDF format. One of the important domains of the OpenData activity is public procurement.

  • Prototype implementation of a secure, distributed government data information system based on Linked Data principles

    diploma thesis, software project

    Sharing personal information with the government today is inefficient. Governmental institutions do not share data, and they are unable to obtain structured personal data electronically in a secure way. A new distributed system could work differently. People would be responsible for storing their personal data securely on the internet, and governmental institutions could ask for permission to access certain subsets of the data. A person could then publish a change to their personal data only once, in their own data space, and all subscribed governmental institutions would receive the update immediately and securely. This could minimize the costs of governmental IT systems and the effort citizens must spend managing their data stored in governmental databases. The goal of this project is to implement a prototype of this system, including secure information storage, subscription requests, and secure update notifications.

  • Extending the Linking Open Data (LOD) Cloud with new datasets

    individual project, bachelor thesis, diploma thesis

    LOD Cloud

    Find potentially valuable datasets that are not yet in the LOD Cloud and whose license does not prohibit users from manipulating them, transform them into RDF, link them to other appropriate datasets from the Linking Open Data (LOD) Cloud, and publish them as Linked Open Data so that they become part of the next version of the LOD Cloud. Get to know the tools for modern data manipulation. A solution for keeping the datasets up to date should be part of the work.

  • Public procurement information system

    individual project, bachelor thesis

    Design and implement a web application for managing information about public procurement. The application must be easily deployable on a website of, for example, a city. It will include a form for data entry according to an ontology for public procurement. The data will be displayed enriched with RDFa. The application should be able to save the data both locally and to remote storage using web services, so the city can decide whether to host the data itself or store it in a remote (central) storage.

  • Intelligent Scraper for Open Data Initiative

    diploma thesis

    The goal of the thesis is to bootstrap the OpenData initiative by designing and implementing an intelligent (X)HTML scraper for scraping details of public contracts (among other data) from the Web. A simple configuration file with CSS selectors defines which data should be scraped and on which pages. The scraper will collect the data and store it as RDF. During this process the scraper will learn which data occur at which location on a page, so that it can scrape, with reasonable precision, other (X)HTML pages not described by any configuration file. The set of pages needed to train the scraper should be as small as possible and the precision of the scraper as high as possible, while keeping a reasonable recall.

  • Extraction and Analysis of PDF Files Associated with Public Procurement

    individual project, bachelor thesis

    The goal of this work is to extract data from PDF files containing specifications of public contracts (using an appropriate library, such as PDFBox), supplement it with semantics, and store it as RDF. The extraction should be driven by a configuration file and be as generic as possible.

  • HTML+RDFa editor

    individual project, bachelor thesis

    Design and implement a simple HTML+RDFa editor. It will highlight HTML and RDFa syntax and automatically display the RDF triples found in a document. The editor will download the specifications of the RDF vocabularies used and will suggest URIs and CURIEs in RDFa attributes according to the editing context. The editor will offer basic validation against the RDF vocabularies used. Supported W3C standards: XHTML+RDFa 1.0, 1.1, HTML5+RDFa.
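
The triple display described for the HTML+RDFa editor can be illustrated with a minimal sketch. It covers only a small subset of RDFa (the `about`, `property`, and `content` attributes), leaves CURIEs unexpanded, and relies on Python's standard `html.parser`. The class name `RdfaSketch`, the helper `extract_triples`, and the `pc:` prefix and URIs in the sample are assumptions made for illustration only; a real editor would need a complete RDFa processor.

```python
from html.parser import HTMLParser


class RdfaSketch(HTMLParser):
    """Collects (subject, property, literal) triples from a tiny RDFa subset."""

    def __init__(self, base):
        super().__init__()
        self.triples = []
        # Each frame remembers an open element, its subject and any open property.
        self.stack = [{"tag": None, "subject": base, "prop": None, "text": []}]

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # "about" sets a new subject; otherwise inherit it from the parent.
        subject = a.get("about") or self.stack[-1]["subject"]
        prop = a.get("property")
        if prop and a.get("content") is not None:
            # The literal is given directly in the "content" attribute.
            self.triples.append((subject, prop, a["content"]))
            prop = None
        self.stack.append({"tag": tag, "subject": subject, "prop": prop, "text": []})

    def handle_data(self, data):
        # Text belongs to every enclosing element that has an open property.
        for frame in self.stack:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        # Close the nearest matching element; unclosed children (for example
        # void elements such as <meta>) are discarded with it.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                frame = self.stack[i]
                if frame["prop"]:
                    literal = "".join(frame["text"]).strip()
                    self.triples.append((frame["subject"], frame["prop"], literal))
                del self.stack[i:]
                break


def extract_triples(html, base):
    """Hypothetical helper: parse an HTML string and return the triples found."""
    parser = RdfaSketch(base)
    parser.feed(html)
    parser.close()
    return parser.triples


# Hypothetical sample document; the vocabulary prefixes are not expanded.
SAMPLE = """
<div about="http://example.org/contract/1">
  <span property="dc:title">Road repair</span>
  <meta property="pc:price" content="1000">
</div>
"""

for triple in extract_triples(SAMPLE, "http://example.org/"):
    print(triple)
```

The sketch keeps a stack of open elements so that a nested element inherits the subject of its nearest ancestor with an `about` attribute, which is the core idea an editor would build its live triple view on.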

The advent of Linked Data in recent years accelerates the evolution of the Web into a giant information space, where the unprecedented volume of resources will offer information consumers a level of information integration and aggregation that has not been possible until now. Consumers can now 'mash up' and readily integrate information for a myriad of alternative end uses. Indiscriminate addition of information can, however, come with inherent problems, such as the provision of (1) poor-quality, (2) inaccurate, (3) irrelevant, or (4) fraudulent information. Each comes with an associated cost that will ultimately affect decision making, system usage, and uptake.

The ability to assess the quality of information on the Web is thus one of the most important aspects of information integration on the Web and will play a fundamental role in the continued adoption of Linked Data principles.

  • GRIAN - Quality Assessment Framework for Linked Data on the Web

    software project

    The goal of the project is to design and implement Grian, a quality assessment framework for Linked Data. Its quality assessment (QA) process will help the information consumer discover and avoid problems (1)-(4) above.

Modeling XML and Evolution

Topics in this section are possible extensions, enhancements, and follow-up projects of eXolutio, our tool for conceptual modeling of XML. The tool uses UML together with XML technologies.

  • Generation of XML schemas

    individual project - bachelor thesis - diploma thesis

    Generating XSD from the model is a core functionality of eXolutio. The tool in its current state provides this feature, but the form of the generated XSD is fixed. The aim of this work is to give the user more detailed control over the form of the generated XSD.

    The preferred way to achieve this is to enrich the current model with a custom UML profile and use this profile to annotate the schemas. From the annotation, the algorithm determines a certain form of translation.

    Variants of this topic: using Relax NG or DTD instead of XML Schema.

  • Reverse Engineering of XML schemas

    individual project - bachelor thesis

    An algorithm for translating eXolutio diagrams into XML schemas (psm2xsd) is part of eXolutio. The aim of this work is to design and implement a reverse engineering algorithm (xsd2psm for the XML Schema language or rng2psm for the Relax NG language).

  • Multi-user support

    bachelor thesis

    In its current state, eXolutio is a single-user tool. The aim of this work is to add support for several users editing the same project at the same time on different workstations connected via a network.

  • XMI interface for eXolutio

    individual project - bachelor thesis - diploma thesis

    The aim of this work is to formalize our extension of UML using UML stereotypes and to work on integration with other commercial tools (Enterprise Architect, Power Designer, ...), mainly by implementing the ability to load an XMI model into an eXolutio project and to store an eXolutio project as an XMI document.

  • Modeling WSDL

    individual project - bachelor thesis

    Diagrams in eXolutio model XML schemas. One of the motivations was to model schemas referenced from web services. The aim of this work is to add support for modeling entire web service interfaces. Part of the work is to implement an algorithm translating the service interface model into a WSDL definition.

  • eXolutio View enhancements

    individual project

    The aim of this work is to analyze user requirements regarding the view part and user interface of the eXolutio tool and then implement the results of the analysis. Some requirements have already emerged, e.g. a smarter layout of both UML and tree diagrams.

  • Support for producing documentation

    individual project

    eXolutio can be used while creating an XML specification, but it can also be used to complement the documentation of the specification. The aim of this work is to implement functions for automatic generation of documentation from an eXolutio project (targeting HTML and PDF as output). The author should find inspiration in existing tools used to generate documentation, such as Sandcastle, Ant, and Enterprise Architect.

  • Modeling Ontologies

    diploma thesis

    eXolutio uses UML for the platform-independent model. The aim of this work is to examine the possibilities of using ontologies and translation to the OWL language.

  • Query modeling

    diploma thesis

    eXolutio is capable of modeling data. The aim of this work is to examine the possibilities of modeling queries (possibly mapped to UML operations).

  • Automatic derivation of PSM schemas in conceptual modeling for XML

    diploma thesis

    eXolutio was designed to model XML formats mapped to UML models. The aim of this work is to further enhance this feature and suggest an initial derivation of the XML format based on the initial requirements of the user.

  • PIM views

    diploma thesis

    Analyze the problem of views at the conceptual level. A view is a separate conceptual diagram tightly mapped to the main diagram in the project; the mapping can be 1:1 between the view and a subset of the main diagram, or a more general M:N.

  • Use eXolutio for modeling real world standards and use cases

    bachelor thesis

  • Using OMG MOF QVT for the purposes of XML schema modeling

    Examine the MOF QVT language and use it to suggest formal bindings between the models used by eXolutio. Suggest required extensions of the language.
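
Several topics in this section revolve around translating a model into an XML schema (psm2xsd) and controlling the form of the result. The following minimal sketch shows the general idea of such a translation. The model representation (a class name plus a list of attribute name and type pairs) and the function `model_to_xsd` are assumptions made for illustration; they do not reflect the actual data structures or algorithm of eXolutio.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
# Serialize the XML Schema namespace with the conventional "xs" prefix.
ET.register_namespace("xs", XS)


def model_to_xsd(class_name, attributes):
    """Translate one hypothetical model class into an XSD string.

    attributes is a list of (name, xsd_simple_type) pairs, for example
    [("title", "string")]; each becomes a child element in a sequence.
    """
    schema = ET.Element(f"{{{XS}}}schema")
    element = ET.SubElement(schema, f"{{{XS}}}element", name=class_name)
    complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
    for attr_name, xsd_type in attributes:
        ET.SubElement(
            sequence,
            f"{{{XS}}}element",
            name=attr_name,
            type=f"xs:{xsd_type}",
        )
    return ET.tostring(schema, encoding="unicode")


# Hypothetical example class with two attributes.
print(model_to_xsd("Contract", [("title", "string"), ("price", "decimal")]))
```

This sketch always emits a nested, anonymous complex type. Giving the user control over the form of the generated XSD, as the schema generation topic proposes, would mean making choices like this one (for example nested versus named global types) configurable through annotations on the model.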