
First massive file and metadata import


#1

Dear everybody,

We are a secondary school that recently bought Extensis Portfolio with all the extensions, including the API. After installing and configuring the Extensis software on a Windows Server 2012 R2 machine, we face the daunting task of importing thousands of files (mainly videos) and the corresponding metadata from all of our 200 teachers.

We plan to proceed as follows:

  1. We have created a small Java program for the teachers that runs on a website and produces the following output:

    a. It stores the file in a folder controlled by the Extensis software, so the files are automatically imported into our catalog.
    b. It creates a metadata file with the options the teacher has chosen.

  2. The problem now is how to import all these metadata files. With thousands of files, it doesn’t make sense to import the metadata manually, so we are wondering how we could automate this step. I guess this can be done using the API.

So, my questions are:

a. Can we use the API to automate the bulk import of metadata?
b. Is there an easier way to accomplish this task without using the API?

Thank you for the answers in advance.

Best regards

Jaume Sampériz
IT KSSO
Switzerland


#2

Hi Jaume,

Welcome to Portfolio!

You can absolutely accomplish what you describe using the Portfolio API, and the API is the right tool for that task. In particular, there is a method in the API to add properties (like “keywords” or “description”) to assets, and depending on how you set up your catalogs and what file types the videos are, Portfolio can then embed those values into the video files themselves.

The other part of your task that you’ll need to figure out is how to associate the metadata files you create with the video files you create. You could do this with a naming scheme, so if you create a video called “Test123.mpg”, then you could create a metadata file named “Test123.txt”. That way, after Portfolio has cataloged the video file, you know which asset to add the metadata to.
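
As a rough illustration of such a naming scheme, here is a minimal Java sketch that derives the metadata file name from the video file name. The class name `MetadataPairing` and the folder names `videos` and `metadata` are hypothetical and only serve the example; nothing here calls Portfolio itself.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class MetadataPairing {

    /** Returns the file name without its last extension, e.g. "Test123.mpg" -> "Test123". */
    static String baseName(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // dot > 0 keeps names such as ".hidden" intact.
        return dot > 0 ? name.substring(0, dot) : name;
    }

    /** Builds the path of the matching metadata file inside metadataDir (assumed ".txt" suffix). */
    static Path metadataFileFor(Path video, Path metadataDir) {
        return metadataDir.resolve(baseName(video) + ".txt");
    }

    public static void main(String[] args) {
        // Hypothetical folders, for illustration only.
        Path video = Paths.get("videos", "Test123.mpg");
        Path meta = metadataFileFor(video, Paths.get("metadata"));
        System.out.println(video.getFileName() + " -> " + meta.getFileName());
    }
}
```

Note that this scheme only works if the base names are unique: "Test123.mpg" and "Test123.mov" would both map to "Test123.txt".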

We have both a REST and a SOAP API that would accomplish your goal. We will be happy to point you to example code and documentation and to provide assistance if you run into problems. If you want to walk through your solution in more detail, just let us know.

Cheers,
-Loren


#3

Hi Loren,

I have spoken with our new young developer and we will go for the REST option.
Could you provide us with the example code and documentation that you mention in your post?

Cheers,

Jaume
IT KSSO


#4

Hi Jaume,

The REST API documentation can be downloaded here: viewtopic.php?f=31&t=213

We’ll get you some example code to import metadata ASAP.

Cheers,
-Loren


#5

Greetings, Jaume!

The most efficient way to handle your staff’s video file and metadata uploads would be to let them upload directly into your Portfolio system (e.g. playground.extensis.com/api/dropzone/).

You would need to use Portfolio’s SOAP API, but you may be able to modify your existing Java code so that it can interact directly with Portfolio (e.g. doc.extensis.com/api/portfolio/C … .html#java).

With regard to your recently cataloged videos, note that you can easily import the corresponding metadata via the Portfolio Web interface using Filename as the “key” (e.g. helpdocs.extensis.com/en/portfol … ght=import).


#6

Hi Jaume,

After reading James’s excellent suggestions, and noticing on a second read that you would like to avoid using the API, we see three ways in which you could accomplish what you want.

  1. If you have not yet cataloged the files into Portfolio, you could use a 3rd party library to embed your metadata into the original files themselves, before Portfolio is even involved. Then once the metadata is embedded, you can catalog the files in Portfolio via an AutoSync folder, and Portfolio will extract the embedded metadata and store it in its database. This would involve you writing a little code to do the embedding.

  2. If the files have already been cataloged into Portfolio, or if you do not want to embed into the original files, then you can use the Import Data feature in Portfolio, as James wisely suggested. You can create a text file with all of the metadata for all the files, and use the assets’ paths (or possibly filenames) as the keys in that file. This option involves you writing the least code (just enough to write out the text file that will be imported), and again lets you avoid using the API entirely.

  3. Finally, you could use the API method updateFieldValues to add the metadata. This obviously means you would write some code using our API.
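
To make option 2 more concrete, here is a minimal Java sketch that writes such a text file. It assumes a tab-delimited layout with a header row and the field names "Filename", "Keywords" and "Description"; the exact format and field names expected by Import Data should be checked against the Portfolio documentation. The class name `ImportFileWriter` and the output file name are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ImportFileWriter {

    /** Replaces tabs and line breaks so that a value cannot break the row layout. */
    static String clean(String value) {
        return value == null ? "" : value.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    /** Joins the values of one record into a single tab-delimited line. */
    static String toRow(String... values) {
        List<String> cleaned = new ArrayList<>();
        for (String v : values) {
            cleaned.add(clean(v));
        }
        return String.join("\t", cleaned);
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: header row first, then one row per asset, keyed by filename.
        List<String> lines = Arrays.asList(
                toRow("Filename", "Keywords", "Description"),
                toRow("Test123.mpg", "chemistry, titration", "Lab demo\nfor year 2"));
        Path out = Paths.get("portfolio-import.txt"); // hypothetical output file
        Files.write(out, lines, StandardCharsets.UTF_8);
    }
}
```

Cleaning the values matters because teachers may paste text that contains tabs or line breaks, which would otherwise shift columns or split a record across rows.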

I have two questions for you which will help us advise how to proceed:

  1. Is this a one-time import, or is this something you are going to want to do periodically with new assets?
  2. Are the assets’ filenames unique?

If you let us know the answers to these questions, and which of the three options above you would like to pursue, we can provide you with some more details on how to proceed.

Thanks,
-Loren


#7

Hi Loren and James,

Sorry for the delay and thank you very much for the posts. Here are my comments and questions:

  1. The assets’ filenames will be unique. As you and James suggest, we will use the filename as the key for matching assets and metadata.

  2. We haven’t imported any definitive files yet. The plan is to import the current digital libraries from the different school departments (music, physics, chemistry, art…) and then allow the teachers to update Portfolio with new files. This means we will have one big initial import followed by updates, expected on a daily basis.

  3. The school staff do not want to import and catalog the files manually, so the teachers will have to do the job themselves through some kind of software, as we don’t want our approx. 200 teachers to have direct access to the Portfolio website (on either port 8090 or 8091).

  4. We already have a small working Java application for the teachers that runs on a website and essentially does the following:

a. Allows the teacher to select a file and upload it to the correct “autosync” folder controlled by the Portfolio Server. The software performs some checks, such as filename uniqueness.
b. Allows the teacher to fill in the metadata and creates a text file with the metadata and the filename. In our tests, we import this file manually into Portfolio without problems.
c. This software has almost the same functionality as playground.extensis.com/api/dropzone/

  5. The Java software runs on GlassFish Server Open Source Edition 4.1.1, installed on the same server as the Portfolio Server. For development we use NetBeans IDE 8.1 and Java SDK 8 Update 77.

  6. As our Java software already works as desired except for the automatic import of the metadata, and we have no experience with third-party software for embedding metadata, it seems to me a better idea to stick to our plan and not embed the metadata.

  7. For the one-time import I can easily create a big text file with the metadata for many files and import it manually. But I cannot use this method for the daily import of files, as I don’t have the time to do it and the school staff assigned to the library have made it clear that they will not do it. So some kind of automation is needed.

  8. playground.extensis.com/api/dropzone/ seems to be almost what we need. Is it possible to get the code of this program, modify it and run it on the Portfolio Server?
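
Since the filename is the key for matching assets and metadata, the uniqueness check mentioned in point 4a is worth showing. The following is only a simplified sketch of that kind of check, not our production code: the class name `UniqueNameCheck` is made up for this post, and it looks at a single folder only, comparing names case-insensitively because the server runs on Windows.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class UniqueNameCheck {

    /** Returns true if the folder already contains an entry with this name, ignoring case. */
    static boolean isNameTaken(Path folder, String candidate) throws IOException {
        // Files.list must be closed, hence the try-with-resources block.
        try (Stream<Path> entries = Files.list(folder)) {
            return entries.anyMatch(
                    p -> p.getFileName().toString().equalsIgnoreCase(candidate));
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the real "autosync" folder.
        Path folder = Files.createTempDirectory("autosync-demo");
        Files.createFile(folder.resolve("Test123.mpg"));
        System.out.println(isNameTaken(folder, "test123.MPG"));
        System.out.println(isNameTaken(folder, "Other.mpg"));
    }
}
```

A real check would have to cover every folder that feeds the catalog, and probably the catalog itself, because a file may have been moved after it was cataloged.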

Best regards,

Jaume
IT KSSO


#8

Hi Jaume,

I just wanted to let you know that the team is actively discussing how we can best help you craft a solution to your problems. We are pursuing a couple of possibilities and will get back to you in a couple of working days with some more concrete options for next steps.

Thanks!
-Loren


#9

Hi Jaume,

I just wanted to check in and see how your investigation is proceeding.

I believe our ICS engineers contacted you about using DropZone.

Please let me know if you have any current questions on the API, or need any assistance deciding between the available technology options.

Thanks,
-Loren


#10

Hi Loren,

Yes, we were contacted by your engineers and told about DropZone, but we decided to first try to develop a custom solution. Our developer will finish the prototype next week, and a test period will follow. At the moment, our custom solution appears to be working as expected.

On another note, I have already finished importing the art department collection and found some features missing when using a NetPublish website. For example, while I’m able to import groups of users from our Windows AD, I cannot filter access to a NetPublish website using these groups. I have to select the allowed users of these websites manually from among hundreds or even thousands of potential users. Do you know where I can request such a feature?

Best regards,

Jaume
IT KSSO


#11

Hi Jaume,

You are welcome to post feature requests here; I will make sure our Product Manager sees and records them.

We definitely hear you about the inconvenience of adding thousands of users to a NetPublish site. In the most recent version of Portfolio, 2.1.4, we added the ability to select/deselect all users when adding them to NetPublish. Hopefully that helps a little in the meantime.

-Loren