Knowledge and Data Integration
Welcome to the homepage of the Fall 2020 edition of Knowledge and Data Integration, a course of the Data Science degree at the University of Trento.

News


Website online!

September 15th, 2020 07:52

 

 

This class will start on Thursday, Sept 17th. More details and the Zoom link are in the Calendar and Material section.

September 15th, 2020 16:32

 

 

New modalities for evaluation and final exam
(see EXAM section).

October 29th, 2020 17:44

 

 

Last modification: October 29th, 2020 17:44

Instructions


The Fall 2020 edition of Knowledge and Data Integration is delivered using the Flipped Classroom methodology, which requires a bit more work on your side during the course (and also on ours), but should yield considerably better results.

In short, slides and videos are provided for each lesson; students can find this material on this website under the "Calendar and Material" section. Each lesson scheduled in the calendar is meant to answer the questions formulated by the students about the material of that lesson and, where applicable, of previous lessons (the "Content of Lesson" column lists the lessons covered by the questions). See the "Questions & Answers" section to learn how to formulate questions. For this reason, students have to read the slides and watch the related videos in advance, so that they can formulate their questions before the lesson in which they will be answered. At the end of each phase of the methodology adopted in the KDI course, there is a General Q&A lesson in which students can ask questions about the whole phase. Note that this end-of-phase Q&A session answers questions only about the phase just completed.

 

GitHub
In this edition of the KDI course, students work with several features offered by GitHub. One of the objectives of the course is to follow the phases of the iTelos methodology while working on different knowledge and data integration projects. Each project is assigned to a team of students, which is in charge of carrying it through all the phases of the methodology. The teams are also present on GitHub, where they are responsible for creating and managing a dedicated project repository. For this reason, every student must have a GitHub account and must communicate their GitHub username to the teachers, together with some general background information, using the form linked here:
https://docs.google.com/forms/d/e/1FAIpQLSeTsR7CSqcNuweyJ1rJG00sCrHTjaXZRle_LdyCYbtFkj9ZtQ/viewform?usp=sf_link

Syllabus


Course Objectives and Outcomes

The Knowledge and Data Integration course aims to provide motivations, definitions, theorems and techniques for a concrete and effective understanding of what is meant by knowledge and data in the context of computer science. It also provides the basic techniques for analyzing and modelling knowledge and data, as well as the basic techniques for data and knowledge integration. Finally, it aims to stimulate students to continue their careers with a stronger interest in data and knowledge representation in their own field of expertise, and to produce computer-processable solutions to relevant problems.

 

General Description

This course will cover the following topics:
  • a general methodology (iTelos) for knowledge and data analysis, modeling and integration.
  • an analysis of the state-of-the-art tools and methodologies for data analysis, modeling and integration.
  • an introduction to ontologies, Extended ER models and linguistic resources.
This is a hands-on, lab- and experiment-based course. Students will be given a data analysis/modelling/integration problem that they will have to solve, possibly while taking the class. During the experiment, students will have to apply the notions introduced in class to the problem.

Teachers


Fausto Giunchiglia: fausto.giunchiglia@unitn.it
Mauro Dragoni: dragoni@fbk.eu
Alessio Zamboni: alessio.zamboni@unitn.it
Simone Bocca: simone.bocca@unitn.it
Mayukh Bagchi: mayukh.bagchi@studenti.unitn.it
Yamini Chandrashekar: yamini.chandrashekar@unitn.it

Calendar and Material


The course runs from Sep 17, 2020 to Dec 11, 2020 with the following schedule; lectures are accessible through the following Zoom links:

     

  • Thursday, 15:30-17:00

    Link: expired

    ID meeting: expired

    Passcode: expired

  • Friday, 14:30-16:00

    Link: expired

    ID meeting: expired

    Passcode: expired

 

The above Zoom links should be used only for the main lectures. The Q&A and assessment lectures will use Zoom links communicated directly by the tutors via Moodle.

 

You might want to read the Instructions to understand how to take the course.

 

Notice also that the titles and structure of the lessons yet to be delivered might change slightly. The rule of thumb is: if there are links to materials, things won’t change; if there are no links to materials, titles and content are just suggestions.

 

Lesson Number | Date | Starts at | Material | Content of Material | Content of Lesson | Professor(s) | Zoom registration | External resources
0 - 1 | 17 Sep, 2020 | 15:30 | Slides_1, Video_1 | Introduction & Representation Diversity. | | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | Lesson |
2 | 18 Sep, 2020 | 14:30 | Slides_1, Video_1, Slides_2, Video_2, Slides_3, Video_3 | Solving Representation Diversity. | | F. Giunchiglia | |
3 | 24 Sep, 2020 | 15:30 | Slides_1, Video_1, Slides_2, Video_2 | iTelos Methodology Overview. | Q&A about lessons 2, 3 | F. Giunchiglia | Lesson |
4 | 25 Sep, 2020 | 14:30 | Slides_1, Video_1.1, Video_1.2 | Projects Management. | Q&A about lessons 2, 3, 4 | F. Giunchiglia, M. Bagchi, S. Bocca | Lesson |
5 | 1 Oct, 2020 | 15:30 | Slides: projectSlides-FHIR Health Record, projectSlides-Covid data integration, projectSlides-GeoSpatial Domain, projectSlides-Transportation Domain, projectSlides-Educational Events, projectSlides-Health Facilities, projectSlides-Tourist Events, projectSlides-Tourist Facilities; Videos: projectVideos-FHIR Health Record, projectVideos-Covid data integration, projectVideos-GeoSpatial Domain, projectVideos-Transportation Domain, projectVideos-Educational Events, projectVideos-Health Facilities, projectVideos-Tourist Events, projectVideos-Tourist Facilities | Projects Illustration. | Q&A about lessons 3, 4, 5 | F. Giunchiglia, M. Bagchi, S. Bocca | | Project example
6 | 2 Oct, 2020 | 14:30 | Slides_1, Video_1, Slides_2, Video_2.1, Video_2.2 | Theory Scope Definition and Inception Phases. | Q&A about lessons 4, 5, 6 | M. Dragoni | |
7 | 8 Oct, 2020 | 15:30 | Slides_1, Video_1 | Practice Scope Definition and Inception Phases. | Q&A about lessons 5, 6, 7 | M. Dragoni, A. Zamboni | |
8 | 9 Oct, 2020 | 14:30 | | Q&A General | Project plenary session | M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
9 | 15 Oct, 2020 | 15:30 | | Q&A Scope Definition and Inception Phases. | Project parallel sessions | M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
10 | 16 Oct, 2020 | 14:30 | Slides, Video_1, Video_2 | Theory Informal Modeling Phase. | Q&A about lessons 9 | M. Dragoni | |
11 | 22 Oct, 2020 | 15:30 | Slides, Video, Common-Knowledge | Practice Informal Modeling Phase. | Q&A about lessons 9, 10 | M. Dragoni, A. Zamboni | Lesson | yed-doc, yed-tutorial, EER-doc
12 | 23 Oct, 2020 | 14:30 | | Assessment on Scope and Inception Phases. | Q&A about lessons 9, 10, 11 | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
13 | 29 Oct, 2020 | 15:30 | | Q&A Informal Modeling Phase. | Project parallel sessions | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
14 | 30 Oct, 2020 | 14:30 | Slides, Video | Theory Formal Modeling Phase. | Q&A about lessons 13 | M. Bagchi, F. Giunchiglia | Metadata, Lesson |
15 | 5 Nov, 2020 | 15:30 | slides_1, video_1, slides_2, video_2, slides_3, video_3, slides_4, video_4, slides_5, video_5, slides_6, video_6, Evaluation_Inception_Inf-Modeling_slides, Evaluation_Inception_Inf-Modeling_video, Demo_Tools_Formal_Modeling_1, Demo_Tools_Formal_Modeling_2 | Practice Formal Modeling Phase. | Q&A about lessons 13, 14 | M. Bagchi, Y. Chandrashekar, F. Giunchiglia, S. Bocca | | OWL Lite RDF, OWL, Protege, UKC, Excel Import File Example
16 | 6 Nov, 2020 | 14:30 | | Assessment on Informal Modeling Phase. | Q&A about lessons 13, 14, 15 | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
17 | 12 Nov, 2020 | 15:30 | | Q&A Formal Modeling Phase. | Project parallel sessions | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
18 | 13 Nov, 2020 | 14:30 | slides_1, video_1, slides_2, video_2, slides_3, video_3 | Theory Data Integration Phase. | Q&A about lessons 17 | S. Bocca, F. Giunchiglia | |
19 | 19 Nov, 2020 | 15:30 | Karmalinker_slides, Karmalinker_video, Karmalinker_demo_1, Karmalinker_demo_2, EML_slides, EML_video, GraphDB_slides, GraphDB_video | Practice Data Integration Phase. | Q&A about lessons 17, 18 | F. Giunchiglia, M. Bagchi, A. Zamboni, S. Bocca | | GraphDB
20 | 20 Nov, 2020 | 14:30 | Metadata integration slides | Q&A General. | Q&A about lessons 17, 18, 19 | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | Lesson |
21 | 26 Nov, 2020 | 15:30 | | Q&A Data Integration Phase. | Project parallel sessions | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
22 | 27 Nov, 2020 | 14:30 | | Assessment on Formal Modeling Phase. | Q&A about all the program | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
23 | 3 Dec, 2020 | 15:30 | | Q&A General. | Q&A about all the program | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
24 | 4 Dec, 2020 | 15:30 | Evaluation integration lecture | Evaluation process | Q&A about all the program | M. Dragoni | Lesson |
25 | 17 Dec, 2020 | 15:00 | | Exams - Final Assessment | Tourist Events, Covid-19 | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
26 | 17 Dec, 2020 | 16:30 | | Exams - Final Assessment | GeoSpatial, Tourist Facilities | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
27 | 18 Dec, 2020 | 15:00 | | Exams - Final Assessment | Transportation, FHIR Health Record | F. Giunchiglia, M. Dragoni, A. Zamboni, M. Bagchi, S. Bocca | |
28 | | | | Class evaluation questionnaires | | | | Methodology, Tools

Exam


Students will be evaluated not only once at the end of the course but also at different moments (4 times) during the project development. These evaluation moments are the Assessments held after the end of each phase. The following describes how the evaluation is structured across the assessments and how the final exam evaluates both the teams and the individual students.

The evaluation given for each Assessment, plus the evaluation of the final exam presentation, will determine the final grade of each student. While it is the responsibility of the evaluator to make sure that everybody is tested, students are welcome to drive the discussions as they find most convenient.

Assessment evaluation (first 3 phases, 30 min each)


During each assessment session, the Project Manager (PM) will explain the results of the methodology phase just completed. As the manager of the project, he/she is in charge of explaining the work of the team, but this doesn’t mean that he/she has to be the only one to present it. The other members of the team should participate in the discussion, describing the more specific aspects related to the roles of Data Scientist (DS), Knowledge Engineer (KE) and Domain Expert (DE).

To evaluate the teams on the current phase, the tutors will ask questions and try to understand the level of the work done by the group. The tutors will also ask more specific questions to individual members of the group, including questions outside their specific role. For example, to understand whether the whole group is well integrated, the tutor may ask the KE questions about data management, and the DS questions about schema generation.

The Assessment evaluation plays an important role in the final grade: it estimates the work of each student, and of each team, during each phase, and therefore makes it possible to determine, in the end, the level of effort put into the project development.

Final Exam evaluation (1 hour)


The final exam, for each project, will consist of:
  • The presentation (slides) of the whole project. (40 min, Q&A included)
  • The Demo presentation (20 min, Q&A included)
As in the interim evaluations, all the members of the team should participate in the discussion, describing their more specific areas of contribution.

After the end of the course, all students will evaluate and give feedback on the iTelos methodology, its strengths and pitfalls, by filling in an online questionnaire.

Questions & Answers


Following the Flipped Classroom methodology, the interactive lessons scheduled in the calendar aim to discuss the questions on the related topic (as described in the "Instructions" section). For this reason, questions can be submitted up to two days before the date on which they are to be discussed.
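As a small illustration of the timing rule above, the sketch below computes the last day on which a question can be posted for a given lesson. It assumes that "two days before" means two calendar days before the lesson date listed in the calendar; the function name `question_deadline` is hypothetical and not part of any course tooling.

```python
from datetime import date, timedelta


def question_deadline(lesson_date: date, days_before: int = 2) -> date:
    """Return the last day on which a question for the lesson can be posted.

    Assumption: the rule counts calendar days, not working days.
    """
    return lesson_date - timedelta(days=days_before)


# Lesson 3 is scheduled on 24 Sep, 2020 in the calendar.
deadline = question_deadline(date(2020, 9, 24))
print(deadline.isoformat())
```

If the teachers count working days or a specific cut-off hour instead, the `days_before` argument alone would not be enough and the rule would need to be adapted.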

How to formulate a question?
Each student has access to the "unitn-kdi-2020-room" GitHub repository (https://github.com/UNITN-KDI-2020/unitn-kdi-2020-room.git), which is used to store general information resources about the course and for the Q&A procedure. Students who want to add a question have to access the repository and create a new ISSUE.
The question issues will have the following properties:
  • Title: question title
  • Body: the question specified as clearly as possible
  • Assignees: the student ("yourself" option on github)
  • Labels: select the label of the lesson the question refers to (e.g. the question label for the first lesson is "Question L1")
  • Projects: if the question is related to a specific KDI project, specify the repository of that project here; otherwise leave the value "None".
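The properties listed above map closely onto the fields accepted by the GitHub REST API when creating an issue. The course page describes the GitHub web interface, so using the API is purely an assumption made here for illustration. The sketch below only builds the request payload and the endpoint URL and sends nothing; the helper names `build_question_issue` and `issues_endpoint` and the example user name are hypothetical. The "Projects" property is left out because it is not a field of this endpoint.

```python
import json

# Repository named on the course page.
REPO = "UNITN-KDI-2020/unitn-kdi-2020-room"


def build_question_issue(title: str, body: str, github_user: str, lesson: int) -> dict:
    """Build the JSON payload for a question issue (hypothetical helper)."""
    if not title.strip() or not body.strip():
        raise ValueError("a question needs both a title and a body")
    return {
        "title": title.strip(),
        "body": body.strip(),
        "assignees": [github_user],         # the student asking the question
        "labels": [f"Question L{lesson}"],  # label format taken from the page
    }


def issues_endpoint(repo: str = REPO) -> str:
    """URL of the GitHub REST endpoint used to create an issue (POST)."""
    return f"https://api.github.com/repos/{repo}/issues"


payload = build_question_issue(
    "Representation diversity",
    "How does the methodology deal with representation diversity?",
    "student-user",  # hypothetical GitHub user name
    1,
)
print(issues_endpoint())
print(json.dumps(payload, indent=2))
```

Note that GitHub applies labels and assignees sent through the API only for users with sufficient permissions on the repository, so creating the issue from the web interface, as the page describes, remains the reference procedure.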

Collaboration Opportunities


Multiple positions are available as 150h activities and internships. They should be considered the first part of a research project and thesis with the Knowdive group. The general activities of the group are listed on the website (http://knowdive.disi.unitn.it/), while activities already scheduled and available now can be found at http://knowdive.disi.unitn.it/work-with-us/. The 150h activities have variable length and are strictly related to software development: for this reason, knowledge of software development in at least one programming language is a must. All the activities can also be carried out remotely.

 

Anyone interested in these opportunities can send an email to knowdive-positions@disi.unitn.it, indicating any preferences in terms of topics or activities (if known). For 150h activities it is important to list the programming languages known, each with the corresponding level on a scale from 1 to 5, where 1 = basic knowledge and 5 = advanced knowledge.

 

Applications to the “150 ore” program can be submitted at:
https://www.unitn.it/servizi/224/collaborazioni-studenti-150-ore
Notice that the deadline for applications for the A.Y. 2020-2021 is September 30, 2020.