Messages
Old exam papers from 2011-2016 can be found here.
No aids are permitted at this year's exam: no book, no notes, no calculator. In other words, the same arrangement as in all previous years.
You may answer the exam in either English or Norwegian. Use the language you are most comfortable with.
Syllabus:
- The textbook, minus chapters 11-12 and 16-18.
- All slides and supplementary articles used in the lectures. See also the lecture plan.
- All exercises, as well as the material used in running the exercise sessions.
Old exam papers (with solution sketches) will be made available soon.
Today's group session (04.05) is unfortunately cancelled due to illness. Any questions can be sent by e-mail or posted on Piazza.
We have been made aware of an error in the test "testWeScienceFuzzy()":
Original:
assertTrue(firstResult.getDocument().getOriginalData().startsWith("[10340610]") || firstResult.getDocument().getOriginalData().startsWith("10840450"));
Correct:
assertTrue(firstResult.getDocument().getOriginalData().startsWith("[10340610]") || firstResult.getDocument().getOriginalData().startsWith("[10840450]"));
The mistake was the missing square brackets "[]" around the second document ID in the original test code.
The slides from Rolf Michelsen's guest lecture today can be found here.
On 3/4, you don't have to listen to my monotonous droning voice. Instead, you're in for a treat!
Rolf Michelsen will hold a guest lecture titled Information Retrieval versus The Real World. Rolf is a seasoned industry veteran and has worked with large-scale search and search-related technologies for the last two decades in companies such as Fast Search & Transfer, Microsoft and Cxense.
Be there! :-)
Due to a scheduling conflict, I have to cancel the lecture tomorrow 27/3. I've updated the lecture plan accordingly.
Last Monday I briefly mentioned that I'd post a link to a good tutorial on how neural networks are used in NLP. Here it is!
As has been pointed out, the precode for assignment-d has no way of retrieving the number of documents needed to calculate the idf value. As a quick fix, simply hard-code the number of documents to 10000; this should make the tests pass.
The tests assume that the tf-idf weighting uses an ntn variant (see slide 41 in lecture 6), so implementations with other variants will most likely not pass the tests, even though they are correct.
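To make the ntn variant concrete, here is a minimal sketch. It assumes the SMART notation from the textbook, where ntn means natural term frequency, idf computed as log10(N/df), and no length normalization. The class and method names are hypothetical and are not part of the precode; the only value taken from the announcement is the hard-coded document count of 10000.

```java
public class NtnWeight {
    // Hard-coded collection size, as suggested in the quick fix above.
    static final int NUMBER_OF_DOCUMENTS = 10000;

    // "t" component: inverse document frequency, log10(N / df).
    // Returns 0 for terms that occur in no documents (an assumption of this sketch).
    static double idf(int documentFrequency) {
        if (documentFrequency <= 0) {
            return 0.0;
        }
        return Math.log10((double) NUMBER_OF_DOCUMENTS / documentFrequency);
    }

    // "n" (natural tf) times "t" (idf), with "n" (no normalization) applied afterwards.
    static double weight(int termFrequency, int documentFrequency) {
        return termFrequency * idf(documentFrequency);
    }

    public static void main(String[] args) {
        // A term occurring 3 times in a document and appearing in 100 of the 10000 documents.
        System.out.println(weight(3, 100)); // prints 6.0
    }
}
```

Other variants (for example logarithmic tf or cosine normalization) produce different scores and possibly a different ranking, which is why they may fail the tests without being wrong.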
Just so that it's clear: As previously mentioned, some of the slides used in this course were created by the authors of the textbook for a course held at Stanford. Hence, when one of the slides used today mentions "a midterm on the 22nd", that is a reference to the Stanford course -- there is no midterm on the 22nd in INF3800/INF4800! :-)
I ran out of time last Monday and didn't cover Chapter 8 and metrics for evaluating relevancy. Will play catch-up tomorrow!
Because of low interest in the group sessions on Wednesdays, these are discontinued. The group sessions on Thursdays will continue as normal, but now with two teaching assistants. Keep using Piazza for asking and answering each other's questions!
There will be no lecture on Monday 20/2 due to travel.
We have now created a Piazza for INF3800/4800. This page can be used to ask and answer the questions of fellow students. To sign up, follow this link. Please read through the "Welcome to Piazza!" before posting any questions.
The textbook mentions Soundex when it comes to doing phonetic searching and retrieval based on a "soundslike" operator. In Monday's lecture I mentioned the improved Double Metaphone algorithm. You can also take a look at some source code to get a better feel for how the algorithm works.
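For comparison with Double Metaphone, here is a minimal sketch of plain Soundex, following the general scheme in the textbook: keep the first letter, map the remaining letters to digits, collapse runs of identical digits, drop the zeros, and pad to four characters. The class name is hypothetical, and one detail is an assumption of this sketch: a digit that matches the code of the first letter is suppressed, as in the common formulation of the algorithm.

```java
public class SoundexSketch {
    // Digit for each letter A..Z; vowels and H, W, Y map to '0'.
    private static final String CODES = "01230120022455012623010202";

    static String encode(String word) {
        // Keep only the letters A-Z; anything else is ignored in this sketch.
        String letters = word.toUpperCase().replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        out.append(letters.charAt(0)); // retain the first letter
        // Assumption: compare the second letter against the first letter's code.
        char previous = CODES.charAt(letters.charAt(0) - 'A');
        for (int i = 1; i < letters.length() && out.length() < 4; i++) {
            char digit = CODES.charAt(letters.charAt(i) - 'A');
            // Skip zeros and skip a digit that repeats the one just before it.
            if (digit != '0' && digit != previous) {
                out.append(digit);
            }
            previous = digit;
        }
        while (out.length() < 4) {
            out.append('0'); // pad with trailing zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Hermann")); // prints H655
        System.out.println(encode("Herman"));  // prints H655
    }
}
```

Both spellings map to the same code, which is what lets a "soundslike" operator match them against the same postings.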
The "Hardware basics" section in Chapter 4 is somewhat dated: it's ten years old and doesn't mention SSDs. Here are some more up-to-date numbers, for reference.
The assignment calendar, with an overview of publication and submission dates for the mandatory assignments, is now out and can be found under Resources on the right-hand side of the page. The first mandatory assignment will be published during the day!