Introduction to Information Systems Project

Introduction

Goals

The main goal of the project is, of course, to practice the techniques that you will see in the lecture. We have tried to organize the project in such a way that it resembles the projects you may have later in your life. This means that there is a schedule to stick to and you will need to work together with other teams. Last but not least the project should be fun! During the last week of the semester you get the opportunity to prove your economic skills in a game. (Yes, there will be a winner! :-)

Approach

You will work in teams of three people that you organize yourselves. (If the number of students is not divisible by three, we allow a maximum of two teams of two people.) The project has several phases, and you have to hand in a report for each of them. There is also a schedule that helps you divide up your work and is necessary for synchronization between the teams.

Every team must have a website where the reports are published. These websites become particularly important for the final part of the project -- the game.

Organization

Your team

Philippe Cudré-Mauroux: Responsible for the course in general. Contact him for general inquiries and feedback.

Student assistants:

Communication

We offer the following communication channels:

Please read the documentation available on the course web site before asking questions. If you find errors, broken links or have any suggestions, please let us know!

As you will have noticed by now the predominant language in this course is English. You can also write in French if you prefer, but we really encourage you to take this opportunity to practice your English as a preparation for the Master cycle, which is almost 100% in English. Finally you can also write in German.

Grading

The grade you achieve in the project will count for 50% of your final grade.

The project grade will be calculated as the weighted average of the three project phases. Every phase will be graded according to the grading matrices that you can find in the phase descriptions.

There may be the possibility of receiving a bonus on your project grade. Details will be announced later.

Please note that the game results do not influence your grade, which does not mean, however, that there won't be a certain incentive. :-)

Finally, here is the mathematical formula that will be used to compute your grade:

project_grade = min(6, 0.4 * phase1 + 0.3 * phase2 + 0.3 * phase3 + bonus)

For more information please read the details of the grading process carefully.
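
As a minimal sketch, the formula above can be written as a small Python function. The function name, the sample phase grades, and the bonus value of 0.5 are illustrative assumptions; the actual bonus has not been announced.

```python
def project_grade(phase1: float, phase2: float, phase3: float,
                  bonus: float = 0.0) -> float:
    """Weighted average of the three phase grades plus bonus, capped at 6."""
    weighted = 0.4 * phase1 + 0.3 * phase2 + 0.3 * phase3
    return min(6.0, weighted + bonus)


if __name__ == "__main__":
    # Hypothetical phase grades, no bonus.
    print(round(project_grade(5.0, 5.5, 4.5), 2))
    # Top grades plus a hypothetical bonus: the result is capped at 6.
    print(project_grade(6.0, 6.0, 6.0, bonus=0.5))
```

Phase 1 carries the largest weight (40%), and the cap means a bonus can never push the project grade above 6.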

Documentation & Links

We strongly encourage you to consult the available documentation. Many problems can be resolved very quickly as soon as you are a little familiar with the documents.

Also, try to use Google Groups as often as possible. Google Groups indexes an archive of millions of newsgroup messages and it is very likely that other people have had the same problem before you. For instance, if you get an error message you don't understand, simply try entering it as is (possibly between double quotes). Often this method is even faster than opening a book or browsing through documentation!

How-to's

There are a number of tutorials on the techniques and infrastructure you need during the project. Please read them before you start the project!

General

Martin has written a rather detailed summary of last year's IIS course. It may come in handy for certain parts of the project. (Please report errors if you find any.)

IIS Summary

An archive of last year's newsgroup content can be found here:

epfl.ic.cours.IIS (Archive 2004)

Java related

Database servers and tools

Regular expressions

Books

Programs

Infrastructure

Hardware

We have 6 machines at the LSIR at our disposal:

The machines have Intel Pentium 4 CPUs at 2.4 GHz, are equipped with 2 GB of RAM, and are running Linux (Fedora Core 3) with kernel 2.6.10. We also have two backup machines available, just in case anything should happen – you know Murphy's law. Please note that, during the first week of April (the Easter break), the machines may be down for some (short) time. The reason is that the room that contains the servers will get a new and more powerful cooling system.

Because we have no physical login we will be using secure shell to log into the machines. You will receive user names and passwords by e-mail as soon as the teams are made up (i.e. at the beginning of week 2).

To log in, you can use any EPFL computer that has an SSH client. If you connect from home, you have to use the EPFL VPN client because of the firewall that seals off the EPFL network. The Sun stations in INF1 and INF3 have a command line client that you can use by entering the following in a terminal:

ssh userXX@lsir-cis-pcY

where userXX is the user name that you have been given and Y the machine number assigned to your team.

Please also read the How to use your account tutorial.

Software

Advanced students only :-)

Below is a list of possible alternatives that you may prefer to use. We will do our best to support you if you choose one of them, but we cannot guarantee anything. If you're in doubt, stick to the suggested combination above or feel free to come talk to us.

The Project

The project is centered around music, music CDs to be more specific. You will gather different sorts of data about music CDs and try to combine them to form a comprehensive database. In the first phase, each team chooses one of three categories of web sites and writes a program ("web crawler") that collects interesting data. You then form consortia of three teams, each with a different category of data, and combine the data. This database will be accessible to all other teams via a browse and search web interface. In the final phase, each team prepares for a bidding game. You develop and set up a web service that allows other teams to place bids for music CDs. Every team has a certain amount of (purely virtual :-) money that they can spend on bidding for CDs. This means that everybody is a supplier and a client at the same time.

Teams

The list of teams and the accounts they have been assigned is available here (only from EPFL or via VPN):

Team list

And here is the list of the consortia:

Consortium list

The address lists for the web sites containing the deliverables can be consulted here:

Phase 1

Keywords:

All the details

Phase 2

Keywords:

All the details

Phase 3

Keywords:

All the details

Game

The bidding game concludes the project. You get to apply the web services you have developed in phase 3 and there's something to win. :-)

All the details

Schedule

Overview

The following table gives you a quick overview of the semester. The project is divided into three main phases. During the last two weeks the game will take place and you will have enough time to finish your final reports.

Week Date Tasks
1 11.3. Build teams; look at the data source web sites and pick a category
2 18.3. Start phase 1; form consortia and enroll for a category; familiarize yourself with the infrastructure; pick a web site as data source
3 25.3. [1] Design database schema; implement the database
- 1.4. Enjoy the Easter break! :-)
4 8.4. Implement the database; develop web crawler & gather data
5 15.4. Develop web crawler & gather data
6 22.4. Develop web crawler & gather data
7 29.4. Start phase 2; exchange gathered data; merge data received from consortium
8 6.5. Merge data received from consortium; develop web pages for browsing/searching data
9 13.5. Develop web pages for browsing/searching data
10 20.5. Start phase 3; develop web services for the game
11 27.5. Develop web services for the game
12 3.6. Develop web services for the game
13 10.6. Finalize phase 3
14 17.6. Game

[1] Friday 25.3. is a public holiday, so there is no course that day. For those of you who still want to get some work done that week (highly recommended) we will arrange a special Q&A session, probably on Thursday 24.3. 1400–1500. Please see the forum if you're interested.

Special dates and deadlines

Important dates and deadlines concerning the project can be found below. For those of you who like to burn the midnight oil: Unless specified otherwise deadlines are at midnight. :-)

Please make sure you meet the deadlines as not doing so will influence your grades. When you have to hand in something you will get a confirmation via e-mail. This means that if you do not receive confirmation within 24h after handing in something, there may have been a problem and you should try to contact us and/or resend your files.

2005-03-14, Mon Enrol teams (details)
2005-03-16, Wed Each team receives their account data for the servers
2005-03-18, Fri Form consortia (enrolment during the exercise session, 1015-1200)
2005-04-22, Fri Deadline phase 1 (hand-in by e-mail to Martin)
2005-04-27, Wed Exchange of gathered data within consortia (self-organized)
2005-05-19, Thu Deadline phase 2 (hand-in by e-mail to Martin before 2400)
2005-06-10, Fri Deadline phase 3 (hand-in by e-mail to Patrik)
2005-06-13, Mon Start of the game (1200)
2005-06-15, Wed End of the game (1800)
2005-06-17, Fri Presentation of the game results

Miscellanea

Problems

If you encounter problems (of technical or human nature), please don't hesitate to talk to us -- problems are best resolved early. Feel free to come by during the office hours, use the newsgroup, or send us e-mails.

Misconduct

This is the section that shouldn't have to be here.

The short version: Any form of cheating is unacceptable.

The long version: Copying partial or full programs from other teams will not be tolerated and will result in a grade of 0 (in words: zero) for the corresponding phase for the copying group. Important: If we cannot determine who has copied from whom, both teams will get a zero grade. This implies that every team must secure their work (e.g. using chmod 700 for directories and chmod 600 for files) and their databases (no blank passwords, etc.). You should also avoid uploading your reports/web pages too long before the deadline. If you want, you can protect the pages with a password and include it with the report you send by e-mail.
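
As a minimal sketch of the permissions suggested above, the following Python function applies mode 700 to every directory and mode 600 to every file below a given root. The function name secure_tree and the example path are hypothetical; running chmod by hand on the command line achieves the same result.

```python
import os
from pathlib import Path


def secure_tree(root: Path) -> None:
    """Restrict a directory tree to its owner: 700 for dirs, 600 for files."""
    # Fix the root first so that the walk below can enter it.
    os.chmod(root, 0o700)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            # rwx for the owner only; group and others get nothing.
            os.chmod(os.path.join(dirpath, name), 0o700)
        for name in filenames:
            # rw for the owner only; note that this also clears execute bits.
            os.chmod(os.path.join(dirpath, name), 0o600)


# Example call (hypothetical path):
# secure_tree(Path.home() / "iis-project")
```

This sketch is meant for source code and working directories. Pages that a web server has to deliver generally need to remain readable by that server, so the published site is probably better protected with a password as described above.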

Note that we do in principle support code reuse and collaboration. This means that you don't have to reinvent the wheel. If you find a library that does exactly what you need for a given subtask, you are allowed (even encouraged) to use it as long as this happens "in a reasonable way". In this case, you must mention this fact in the report and include the source where you've found it. If you are in doubt about whether you are allowed to use a given piece of software, please ask us.

As far as collaboration between teams is concerned, we encourage you to help each other. By helping we mean "show how to do something", not "do somebody else's work". Of course, if you come up with a great idea together with another team, that's perfectly okay. Again in such a case, please include a short note with your report. Oh, and just in case: We still have last year's reports and we're not afraid to compare. :-)

To sum it up: Be reasonable and play fair!

Copyright © Martin Rubli & Patrik Bless – Last change: