WEBVTT

00:00.000 --> 00:19.000
This is an instructional video on how to run transcription with the Whisper software on audio data inside TSD.

00:19.000 --> 00:31.000
I will assume in this video that you have a TSD project that you have managed to log in to.

00:31.000 --> 00:39.000
What we are going to do first is to copy the Whisper software via a server.

00:39.000 --> 00:48.000
To log in to the server, you must open a shell.

00:48.000 --> 00:59.000
We use PuTTY in this example. There are several options, but we use PuTTY, which is installed on all Windows machines in TSD.

00:59.000 --> 01:05.000
To find it, I can use search and search for PuTTY.

01:05.000 --> 01:10.000
I start PuTTY and it looks like this.

01:10.000 --> 01:18.000
What we are going to log in to now is a machine called Submit.

01:18.000 --> 01:23.000
In front of Submit, you must enter your project number.

01:23.000 --> 01:33.000
For me, in this example, it will be p896-submit.

01:33.000 --> 01:40.000
This is the server I'm going to log in to in order to run the commands and copy the software.

01:40.000 --> 01:56.000
I press Open and log in as my user.

01:56.000 --> 02:02.000
In here, I have to run some Unix commands.

02:02.000 --> 02:14.000
In my example, I want to find the Durable folder, because that's where I want to put Whisper.

02:14.000 --> 02:21.000
By default, I end up in a user folder. To see where I am, I can type pwd.

02:21.000 --> 02:24.000
Then I see that it is a home area.

02:24.000 --> 02:32.000
To get to Durable, I type cd followed by two dots (cd ..). Then I go up one level.

02:32.000 --> 02:38.000
Then I can do it again and type pwd.

02:38.000 --> 02:41.000
Then I see that I'm directly in my project.

02:41.000 --> 02:45.000
If I type ls, I get a list of some areas here.

02:45.000 --> 02:52.000
Then I know that if I type cd data, I come into data and can check again with ls.

02:52.000 --> 02:58.000
Then I see that I have Durable. I type cd durable.

02:58.000 --> 03:11.000
I can do an ls and I recognize, for example, the network attachments and network data folders that I have in Durable today.

03:11.000 --> 03:22.000
Now I'm going to copy the Whisper directory, which is in a shared folder on a server, into this area.

03:22.000 --> 03:27.000
For the time being, everyone must run this software locally.

03:27.000 --> 03:43.000
The simplest way is to copy the command provided with the video and paste it in.

03:43.000 --> 03:51.000
It is a copy command, cp -r, where -r is needed to include the subdirectories.

03:51.000 --> 03:54.000
The source is the Whisper directory in the shared software area.

03:54.000 --> 04:05.000
It is copied from there. The dot at the end means that it will end up where I am right now, which is the Durable folder.

04:05.000 --> 04:17.000
This means that the directory is now copied directly under Durable.

04:17.000 --> 04:27.000
This will take some time, as there are some gigabytes of data, but that was it.

04:27.000 --> 04:34.000
Now I would like to switch to File Explorer, because the rest can be done from there.

04:34.000 --> 04:36.000
You have probably used it before.

04:36.000 --> 04:42.000
I can go into my data folder on Durable.

04:42.000 --> 04:46.000
There is now a directory called Whisper.

04:46.000 --> 04:54.000
Inside this directory, there is now a script and a program.

04:54.000 --> 04:58.000
You can see that there is something called large.pt.

04:58.000 --> 05:01.000
This is the largest version of the Whisper model, which needs the most computing power.

05:01.000 --> 05:07.000
In return, it is the one that gives the best quality.

05:07.000 --> 05:10.000
We think that's the best right now.

05:10.000 --> 05:20.000
There is also a data folder, and the script is set up to take all the files in that folder and transcribe them.

05:20.000 --> 05:24.000
I'm going to find some data; I have some that I have prepared.

05:24.000 --> 05:27.000
An audio file and a video file.

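The navigation and copy steps described so far can be sketched as shell commands. This is only a sketch under stated assumptions: the real path to the shared Whisper directory is the one provided with the video, so `WHISPER_SRC` below is a hypothetical placeholder, and `copy_here` is an illustrative helper name that does not exist in TSD.

```shell
#!/bin/sh
# Hypothetical placeholder: replace with the shared path provided with the video.
WHISPER_SRC="${WHISPER_SRC:-/path/to/shared/whisper}"

# Copy a directory, including all of its subdirectories, into the
# current directory. The trailing dot means "where I am right now".
copy_here() {
    src="$1"
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    cp -r "$src" .
}

# Typical session on the submit host (commented out because the
# folder layout differs from project to project):
# pwd                       # show where you are (the home area after login)
# cd ..                     # go up one level
# pwd                       # check again
# ls                        # list the areas in the project
# cd data
# cd durable
# copy_here "$WHISPER_SRC"  # may take a while, the directory is several gigabytes
```

Checking that the source exists before copying gives a clearer message than the one `cp` prints when the path has been mistyped.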
05:27.000 --> 05:35.000
I can copy them, go into the data folder, and paste them in.

05:35.000 --> 05:46.000
Now I want these to be transcribed, so I have to run the script called transcribe data.

05:46.000 --> 05:55.000
This script can only be run from the Linux server that I have logged in to.

05:55.000 --> 05:59.000
I can't run it from Windows Explorer, only on this server.

05:59.000 --> 06:11.000
It is important that you have access to Colossus, i.e. the HPC system, to be allowed to run this program.

06:11.000 --> 06:18.000
The way to run this program is to type ./ (dot slash) followed by the script name, transcribe data.

06:18.000 --> 06:25.000
The command is provided together with the video here, and you can paste it into PuTTY.

06:25.000 --> 06:31.000
You can always paste text into TSD, but you can't copy data out.

06:31.000 --> 06:34.000
That's a handy way to get commands in.

06:34.000 --> 06:49.000
I run the command now, which is ./ followed by transcribe data.

06:49.000 --> 06:58.000
I got an error message here, and when I type pwd I see that it is because I'm in the wrong folder.

06:58.000 --> 07:03.000
So I have to type cd whisper.

07:03.000 --> 07:06.000
I have to be in the whisper directory.

07:06.000 --> 07:17.000
If I type the same command now, it says "Submitted batch job" and a number.

07:17.000 --> 07:26.000
It means that the job has been put in the queue on Colossus.

07:26.000 --> 07:28.000
I have to wait for it to finish.

07:28.000 --> 07:42.000
How long this takes depends on the load on the cluster and how many files you have submitted for transcription.

07:42.000 --> 07:47.000
Here I have put in both an audio file and a video file.

07:47.000 --> 08:04.000
Then it's just a matter of waiting a little while, and I can check in File Explorer whether the results have arrived.

08:04.000 --> 08:18.000
After a while, I can go into the data folder in File Explorer, and there I see that there are now three files.

08:18.000 --> 08:22.000
That is, three for each file that I put in.

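The wrong-folder error described above can be avoided by always starting the script from inside its own directory. The sketch below is one way to do that. The helper name `run_in_dir` and the script file name `transcribe_data.sh` are hypothetical, so use the exact name provided with the video. The `squeue` line assumes the cluster uses the Slurm scheduler, which the "Submitted batch job" message suggests.

```shell
#!/bin/sh
# Run a script from inside its own directory, so that relative paths
# (such as the data folder) resolve the way the script expects.
run_in_dir() {
    dir="$1"
    script="$2"
    if [ ! -x "$dir/$script" ]; then
        echo "no executable $script in $dir (wrong folder?)" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && "./$script" )
}

# Example with hypothetical names (commented out):
# run_in_dir "$PWD/whisper" transcribe_data.sh
# squeue -u "$USER"   # assumes Slurm: lists your queued and running jobs
```

If the job is still listed by `squeue`, the output files have not been written yet and it is too early to look in the data folder.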
08:22.000 --> 08:28.000
This is what Whisper currently produces automatically.

08:28.000 --> 08:34.000
It's a text file, a VTT file and an SRT file.

08:34.000 --> 08:41.000
With that, I wish you good luck with transcribing data in TSD.

08:41.000 --> 08:49.000
It's a good idea to move all the files out of here afterwards, so that you don't transcribe everything again.

08:49.000 --> 08:54.000
Because everything that is in the data folder is transcribed on every run.

08:54.000 --> 09:08.000
I can cut these, and then I could just make a folder here called finished data, for example.

09:08.000 --> 09:12.000
And then I put those files there.

09:12.000 --> 09:21.000
That way I know that when I run the script, everything that is in the data folder gets transcribed.

09:21.000 --> 09:24.000
Good luck.
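The clean-up step at the end can also be done from the shell instead of File Explorer. This is a minimal sketch, assuming the finished files sit directly in the data folder. The helper name `archive_finished` and the folder name `finished_data` are hypothetical.

```shell
#!/bin/sh
# Move every regular file from the data folder into an archive folder,
# so that the next run of the transcription script only sees new files.
archive_finished() {
    data_dir="$1"
    done_dir="$2"
    mkdir -p "$done_dir" || return 1
    for f in "$data_dir"/*; do
        # Skip subdirectories and the unexpanded pattern of an empty folder.
        [ -f "$f" ] || continue
        mv "$f" "$done_dir"/ || return 1
    done
}

# Example with hypothetical folder names (commented out):
# archive_finished whisper/data whisper/finished_data
```

Both the source recordings and the generated text, VTT and SRT files are moved, which leaves the data folder empty for the next batch.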