Introduction to Information Systems Project
Introduction
Goals
The main goal of the project is, of course, to practice the techniques that you will see in the lecture. We have tried to organize the project in such a way that it resembles the projects you may have later in your life. This means that there is a schedule to stick to and you will need to work together with other teams. Last but not least the project should be fun! During the last week of the semester you get the opportunity to prove your economic skills in a game. (Yes, there will be a winner! :-)
Approach
You will work in teams of three people that you organize yourselves. (If the number of students is not divisible by three, we allow a maximum of two teams of two people.) The project is split into phases, and you have to hand in a report for each of them. There is also a schedule that both helps you divide up your work and is necessary for synchronization between the teams.
Every team must have a website where the reports are published. These websites become particularly important for the final part of the project -- the game.
Organization
Your team
Philippe Cudré-Mauroux: Responsible for the course in general. Contact him for general inquiries and feedback.
Student assistants:
Martin Rubli: Project manager. Responsible for the general organization of the project. Feel free to contact him with general feedback, questions, and complaints.
Alexander Goos
Etienne Jodoin
Hui-Yang He
Karim Mardambey
Communication
We offer the following communication channels:
- Website: All announcements will be made here on this page, so please check it regularly. We will try our best to ensure that important events are announced a reasonable amount of time in advance.
- Exercise sessions: Fridays, 1015–1200 in rooms INF1 and INF3. The assistants will be there to respond to your questions.
- Newsgroup: The course has a newsgroup (epfl.ic.cours.IIS) that can be reached in the following two ways:
  You can find more information on how to access the EPFL newsgroups at the SIC.
  Please use the newsgroup instead of e-mail whenever possible. The simple reason is that your questions (and, more important, the answers) will be accessible to everybody.
- Office hours: See main page or contact us by e-mail
- E-mail: For certain parts of the project (like handing in the reports) we will use e-mail. To make sorting the messages a little easier for us, please add [iis] to the subject.
Here are our e-mail addresses:
martin.rubli at epfl.ch
alexander.goos at epfl.ch
etienne.jodoin at epfl.ch
hui-yang.he at epfl.ch
karim.mardambey at epfl.ch
We will take the liberty of forwarding messages of general interest to the newsgroup.
Please read the documentation available on the course web site before asking questions. If you find errors, broken links or have any suggestions, please let us know!
As you will have noticed by now the predominant language in this course is English. You can also write in French if you prefer, but we really encourage you to take this opportunity to practice your English as a preparation for the Master cycle, which is almost 100% in English. Finally you can also write in German.
Grading
The grade you achieve in the project will count for 50% of your final grade.
The project grade will be calculated as the weighted average of the three project phases. Every phase will be graded according to the grading matrices that you can find in the phase descriptions.
There may be the possibility of receiving a bonus on your project grade. Details will be announced later.
Please note that the game results do not influence your grade, which does not mean, however, that there won't be a certain incentive. :-)
Finally, here is the mathematical formula that will be used to compute your grade:
project_grade = min(6, 0.4 * phase1 + 0.3 * phase2 + 0.3 * phase3 + bonus)
For more information please read the details of the grading process carefully.
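To make the formula above concrete, here is a minimal Java sketch. The class name ProjectGrade, the method name compute, and the sample grades are illustrative assumptions and not part of the course material; the authoritative rules are those in the details of the grading process.

```java
final class ProjectGrade {
    // Phase weights and the cap, taken from the formula above.
    static final double WEIGHT_PHASE1 = 0.4;
    static final double WEIGHT_PHASE2 = 0.3;
    static final double WEIGHT_PHASE3 = 0.3;
    static final double MAX_GRADE = 6.0;

    // Weighted average of the three phases plus the bonus, capped at 6.
    static double compute(double phase1, double phase2, double phase3, double bonus) {
        double weighted = WEIGHT_PHASE1 * phase1
                        + WEIGHT_PHASE2 * phase2
                        + WEIGHT_PHASE3 * phase3
                        + bonus;
        return Math.min(MAX_GRADE, weighted);
    }

    public static void main(String[] args) {
        // Made-up sample grades, for illustration only.
        System.out.println(compute(5.0, 4.5, 5.5, 0.0));
        System.out.println(compute(6.0, 6.0, 6.0, 0.5));
    }
}
```

With the made-up grades 5.0, 4.5 and 5.5 and no bonus, the weighted sum is 2.0 + 1.35 + 1.65, i.e. about 5.0 (up to floating-point rounding). In the second call the sum would be 6.5, so the result is capped at 6.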
Documentation & Links
We strongly encourage you to consult the available documentation. Many problems can be resolved very quickly as soon as you are a little familiar with the documents.
Also, try to use Google Groups as often as possible. Google Groups indexes an archive of millions of newsgroup messages and it is very likely that other people have had the same problem before you. For instance, if you get an error message you don't understand, simply try entering it as is (possibly between double quotes). Often this method is even faster than opening a book or browsing through documentation!
How-to's
There are a number of tutorials on the techniques and infrastructure you need during the project. Please read them before you start the project!
General
Martin has written a rather detailed summary of last year's IIS course. It may come in handy for certain parts of the project. (Please report errors if you find any.)
An archive of last year's newsgroup content can be found here:
epfl.ic.cours.IIS (Archive 2004)
Java related
- Java 2 Platform, Standard Edition (API and miscellaneous documentation)
- java.util.regex API
- Java 2 Platform, Enterprise Edition (J2EE) 1.4
- The J2EE 1.4 Tutorial
- Java System Application Server Administration Guide
- JavaServer Pages Standard Tag Library (JSTL) API
- Apache Ant Manual
Database servers and tools
- PostgreSQL 7.4 Documentation (this is also a very good SQL reference)
- PostgreSQL JDBC Interface Documentation
- MySQL Reference Manual
- pgAdmin III (a graphical interface for Postgres)
Regular expressions
- Starmerj's Perl RegExp Crib Sheet (don't worry if it says Perl, the differences are very small and documented here in the section "Comparison to Perl 5")
- Using Regular Expressions in Java
- perlrequick - Perl regular expressions quick start (short introduction but somewhat Perl specific)
- perlretut - Perl regular expressions tutorial (rather detailed tutorial, also rather Perl specific)
- perlre - Perl regular expressions (all the details, also Perl specific but nevertheless an excellent reference)
Books
- Version Control with Subversion (an online introduction to the version control system)
- Mastering Regular Expressions
- Java Regular Expressions: Taming the java.util.regex Engine
Programs
- RegExplorer, a visual regular expression explorer
Infrastructure
Hardware
We have 6 machines at the LSIR at our disposal:
- lsir-cis-pc1.epfl.ch (128.178.156.150)
- lsir-cis-pc2.epfl.ch (128.178.156.151)
- lsir-cis-pc3.epfl.ch (128.178.156.152)
- lsir-cis-pc4.epfl.ch (128.178.156.153)
- lsir-cis-pc5.epfl.ch (128.178.156.154)
- lsir-cis-pc6.epfl.ch (128.178.156.155)
The machines have Intel Pentium 4 CPUs at 2.4 GHz, are equipped with 2 GB of RAM, and are running Linux (Fedora Core 3) with kernel 2.6.10. We also have two backup machines available, just in case anything should happen – you know Murphy's law. Please note that, during the first week of April (the Easter break), the machines may be down for some (short) time. The reason is that the room that contains the servers will get a new and more powerful cooling system.
Because we have no physical access to the machines, we will be using secure shell (SSH) to log into them. You will receive user names and passwords by e-mail as soon as the teams are made up (i.e. at the beginning of week 2).
To log in you can use any EPFL computer that has an SSH client. If you connect from home you have to use the EPFL VPN client because of the firewall that seals off the EPFL network. The SUN stations in INF1 and INF3 have a command line client that you can use by entering the following in a terminal:
ssh userXX@lsir-cis-pcY
where userXX is the user name that you have been given and Y the machine number assigned to your team.
Please also read the How to use your account tutorial.
Software
- Java 2 Platform, Standard Edition 1.4
- PostgreSQL 7.4
- Java 2 Platform, Enterprise Edition (J2EE) 1.4
- PostgreSQL JDBC Driver
Subversion and CVS are also available on the machines for those who would like to use a version control system. (If you don't know what this is, then you should definitely have a look at this introduction.)
Advanced students only :-)
Below is a list of alternatives that you may prefer to use. We will do our best to support you if you do, but we cannot guarantee anything. If you're in doubt, stick to the suggested combination above or feel free to come talk to us.
- Java 2 Platform, Standard Edition 5.0 (a.k.a. 1.5)
- MySQL (≥ 4.1 highly recommended)
- MySQL Connector/J (the MySQL JDBC driver)
- Jakarta Regexp (Open Source Regular Expression library for Java)
The Project
The project is centered around music, music CDs to be more specific. You will gather different sorts of data about music CDs and try to combine them to form a comprehensive database. In the first phase each team chooses one of three categories of web sites and writes a program ("web crawler") that collects interesting data. You then form consortia of three teams, each with a different category of data, and combine the data. This database will be accessible via a browse and search web interface to all other teams. In the final phase each team prepares for a bidding game. You develop and set up a web service that allows other teams to place bids for music CDs. Every team has a certain amount of (purely virtual :-) money that they can spend on bidding for CDs. This means that everybody is a supplier and a client at the same time.
Teams
The list of teams and the accounts they have been assigned is available here (only from EPFL or via VPN):
And here is the list of the consortia:
The address lists for the web sites containing the deliverables can be consulted here:
Phase 1
Keywords:
- Formation of teams and consortia
- Read documentation
- Get familiar with the infrastructure
- Database design and implementation
- Web crawler
Phase 2
Keywords:
- Exchange of collected data
- Merge data received from the other two teams
- Web interface for browsing/searching data
Phase 3
Keywords:
- Web services
- Bidding client and bidding server
- Preparation for game
Game
The bidding game concludes the project. You get to apply the web services you have developed in phase 3 and there's something to win. :-)
Schedule
Overview
The following table gives you a quick overview of the semester. The project is divided into three main phases. During the last two weeks the game will take place and you will have enough time to finish your final reports.
| Week | Date | Tasks |
| 1 | 11.3. | Build teams; look at the data source web sites and pick a category |
| 2 | 18.3. | Start phase 1; form consortia and enroll for a category; familiarize yourself with the infrastructure; pick a web site as data source |
| 3 | 25.3. [1] | Design database schema; implement the database |
| | 1.4. | Enjoy the Easter break! :-) |
| 4 | 8.4. | Implement the database; develop web crawler & gather data |
| 5 | 15.4. | Develop web crawler & gather data |
| 6 | 22.4. | Develop web crawler & gather data |
| 7 | 29.4. | Start phase 2; exchange gathered data; merge data received from consortium |
| 8 | 6.5. | Merge data received from consortium; develop web pages for browsing/searching data |
| 9 | 13.5. | Develop web pages for browsing/searching data |
| 10 | 20.5. | Start phase 3; develop web services for the game |
| 11 | 27.5. | Develop web services for the game |
| 12 | 3.6. | Develop web services for the game |
| 13 | 10.6. | Finalize phase 3 |
| 14 | 17.6. | Game |
[1] Friday 25.3. is a public holiday, so there is no course that day. For those of you who still want to get some work done that week (highly recommended) we will arrange a special Q&A session, probably on Thursday 24.3. 1400–1500. Please see the forum if you're interested.
Special dates and deadlines
Important dates and deadlines concerning the project can be found below. For those of you who like to burn the midnight oil: Unless specified otherwise deadlines are at midnight. :-)
Please make sure you meet the deadlines as not doing so will influence your grades. When you have to hand in something you will get a confirmation via e-mail. This means that if you do not receive confirmation within 24h after handing in something, there may have been a problem and you should try to contact us and/or resend your files.
| 2005-03-14, Mon | Enrol teams (details) |
| 2005-03-16, Wed | Each team receives their account data to the servers |
| 2005-03-18, Fri | Form consortia (enrolment during the exercise session, 1015-1200) |
| 2005-04-22, Fri | Deadline phase 1 (hand-in by e-mail to Martin) |
| 2005-04-27, Wed | Exchange of gathered data within consortia (self-organized) |
| 2005-05-19, Thu | Deadline phase 2 (hand-in by e-mail to Martin before 2400) |
| 2005-06-10, Fri | Deadline phase 3 (hand-in by e-mail to Patrik) |
| 2005-06-13, Mon | Start of the game (1200) |
| 2005-06-15, Wed | End of the game (1800) |
| 2005-06-17, Fri | Presentation of the game results |
Miscellanea
Problems
If you encounter problems (of technical or human nature), please don't hesitate to talk to us -- problems are best resolved early. Feel free to come by during the office hours, use the newsgroup, or send us e-mails.
Misconduct
This is the section that shouldn't have to be here.
The short version: Any form of cheating is unacceptable.
The long version: Copying partial or full programs from other teams will not be tolerated and will result in a grade of 0 (in words: zero) for the corresponding phase for the copying group. Important: If we cannot determine who has copied from whom, both teams will get a zero grade. This implies that every team must secure their work (e.g. using chmod 700 for directories and chmod 600 for files) and their databases (no blank passwords, etc.). You should also avoid uploading your reports/web pages too early before the deadline. If you want you can protect the pages with a password and include it with the report you send by e-mail.
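For teams that would rather set these permissions from a program than from the shell, the following is a rough Java sketch of the same idea. It assumes a JDK that provides java.nio.file (Java 7 or later, which is newer than the J2SE 1.4 installation listed on this page) and a POSIX file system such as the one on the Linux servers. The class name OwnerOnly, the method name restrict, and the temporary paths are hypothetical; running chmod 700 and chmod 600 in a terminal achieves the same result.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

final class OwnerOnly {
    // Same effect as "chmod 700" on a directory and "chmod 600" on a regular file.
    static void restrict(Path path) {
        String mode = Files.isDirectory(path) ? "rwx------" : "rw-------";
        try {
            Files.setPosixFilePermissions(path, PosixFilePermissions.fromString(mode));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Throwaway temporary paths, used here only for demonstration.
        Path dir = Files.createTempDirectory("iis-demo");
        Path file = Files.createTempFile(dir, "report", ".txt");
        restrict(dir);
        restrict(file);
        System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(dir)));
        System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(file)));
    }
}
```

Note that the sketch handles one path at a time; it does not walk a directory tree, so each directory and file you want to protect has to be passed in explicitly.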
Note that we do in principle support code reuse and collaboration. This means that you don't have to reinvent the wheel. If you find a library that does exactly what you need for a given subtask, you are allowed (even encouraged) to use it as long as this happens "in a reasonable way". In this case, you must mention this fact in the report and include the source where you've found it. If you are in doubt about whether you are allowed to use a given piece of software, please ask us.
As far as collaboration between teams is concerned, we encourage you to help each other. By helping we mean "show how to do something", not "do somebody else's work". Of course, if you come up with a great idea together with another team, that's perfectly okay. Again in such a case, please include a short note with your report. Oh, and just in case: We still have last year's reports and we're not afraid to compare. :-)
To sum it up: Be reasonable and play fair!
Copyright © Martin Rubli & Patrik Bless –
Last change:
This page uses
valid XHTML 1.0 Strict and
valid Cascading Style Sheets, Level 2.