Vyapin Blogs

August 2, 2011

DocKIT: Rename folders / files using metadata column values

Most content migrations to SharePoint involve moving archived and active content from different systems such as file servers, network shared folders and other document management systems (e.g., Hummingbird, DocuShare, DocsOpen). This task comes on top of migrating frequently changing folders and files from the current storage system on a day-to-day basis.

Content stored in these systems often has folder and file names that follow legacy naming conventions or standards that prevailed at the time. Folder and file names may also contain special characters that are valid in the legacy system but invalid (disallowed) in SharePoint. During migration from legacy systems to SharePoint, users might want to replace the old names with new ones and replace invalid characters in folder and file names with valid ones.

Apart from this, users may also need to rename folders and files that already exist in SharePoint.

In such cases, organizations and users who migrate content into SharePoint in bulk need a user-friendly way, with varied options, to rename those folders and files in bulk during or after migration.

In a nutshell, we highlight the following general and specific cases of renaming requirements:

Rename folders & files while migrating from legacy systems to SharePoint
Rename folders & files with invalid characters to valid characters in SharePoint
Rename folders & files that already exist in SharePoint

Folder level requirements:

Create a new folder hierarchy (restructure existing contents) and import files into them during migration
Create new document sets by way of renaming folders

Document / File level requirements:

Rename files of different names and related contents and import them as document versions (maintain version history)
Rename files having names for each version (appended with version number) and import them as document versions

To cater to the above-mentioned renaming requirements, DocKIT for SharePoint 2010 lets you rename folders and files in the following two ways:

  1. Rename folders & files using the values specified in an external metadata file or batch descriptor file.

  2. Rename folders & files using folder & file naming rules.

Rename folders & files using the values specified in an external metadata file / batch descriptor file

I will explain how to use the external metadata file or a batch descriptor file to rename folders and files in the most common cases using a few examples below:

1. Rename files simply by providing alternate names

2. Rename folders simply by providing alternate names

3. Rename files that have same extension

Let’s take an example where the files generated by an external system all have the same extension (e.g., “.01”). In this case, the end user may want to provide suitable alternate file names that are more meaningful. Moreover, since SharePoint handles MS Office files (such as .docx, .doc, .xlsx, .xls, .pptx, .ppt) differently, it is useful to rename those files so that they are stored with the correct file extension.

| System generated file name | New file name | Description |
|---|---|---|
| Action Plan.01 | Action Plan.docx | This file is generated by Microsoft Word 2010 application. |
| Status Info.01 | Status Info.doc | This file is generated by Microsoft Word 2007 application. |
| Resource Allocation.01 | Resource Allocation.xlsx | This file is generated by Microsoft Excel 2010 application. |
| Time Schedule.01 | Time Schedule.xls | This file is generated by Microsoft Excel 2007 application. |
| Architecture.01 | Architecture.pptx | This file is generated by Microsoft PowerPoint 2010 application. |
| Presentation.01 | Presentation.ppt | This file is generated by Microsoft PowerPoint 2007 application. |

It would be most helpful if the end user could provide a suitable alternate name by appending the extension that corresponds to the native application that generated the file / document.

You can achieve this by specifying (source file path, new file name) in an external metadata file or the batch descriptor file.

By using an external metadata file
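As a rough sketch of the idea, the Python snippet below builds (source file path, new file name) pairs for the “.01” files in the table above and writes them out as comma-separated text. The CSV layout, the ‘Path’ and ‘New Name’ column headers as used here, the D:\Reports folder and the helper names are assumptions made for illustration only; check the DocKIT help for the exact file format it expects.

```python
import csv
import io
import ntpath

# Hypothetical mapping from each system-generated name to the extension of the
# application that produced the file (values taken from the table above).
NATIVE_EXTENSIONS = {
    "Action Plan.01": ".docx",
    "Status Info.01": ".doc",
    "Resource Allocation.01": ".xlsx",
    "Time Schedule.01": ".xls",
    "Architecture.01": ".pptx",
    "Presentation.01": ".ppt",
}


def build_rename_rows(source_folder, extensions):
    """Return (source file path, new file name) pairs for each known file."""
    rows = []
    for old_name, new_ext in extensions.items():
        stem, _old_ext = ntpath.splitext(old_name)
        rows.append((ntpath.join(source_folder, old_name), stem + new_ext))
    return rows


def write_metadata_csv(rows, stream):
    """Write the rows under assumed 'Path' and 'New Name' column headers."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["Path", "New Name"])
    writer.writerows(rows)


# D:\Reports is a made-up source folder used only for this example.
rows = build_rename_rows(r"D:\Reports", NATIVE_EXTENSIONS)
buffer = io.StringIO()
write_metadata_csv(rows, buffer)
print(buffer.getvalue())
```

Generating the file from a mapping like this keeps the old and new names side by side, which makes the list easy to review before running the migration.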




4. Create a new folder hierarchy and import files into them during migration

Another specific scenario you may encounter is migrating all files in a folder to a new sub-folder in the destination location. This can be easily achieved through the “Batch Descriptor file” task option in DocKIT.

You can achieve this by stating the desired folder hierarchy under the New Folder column when specifying batch file entries. You also have to set ‘No’ for ‘Do you want to create the top-level folder(s) included for import?’ in the ‘Folder Options’ step of the DocKIT Task Wizard.



5. Create a new document set by way of renaming folders on the fly and import files into them during migration

To take advantage of the Document Set feature newly introduced in SharePoint 2010, you may want to create document sets from existing folders during migration into SharePoint. DocKIT handles this requirement.

To accomplish this, you have to specify the name of the new folder that is to be created as a document set, along with the folder content type (created with ‘Document Set’ as its parent content type) to be used for creating it.

After creating the specified folder as document set, DocKIT will create its sub-folders as document sets and migrate the files under the respective sub-folders.


Adding custom document set content type to SharePoint library:


6. Rename files that already exist in SharePoint by providing alternate names

DocKIT provides an option to rename files that already exist in SharePoint. It works the same way as for new files imported into SharePoint, except for the Path column: it is replaced by the Destination Path column, which should contain the URL of the file (list item) in SharePoint.

You can use only the batch file syntax to rename files that already exist in SharePoint. In the batch file, specify the Destination Path column (not the Path column) along with the New Name column.

7. Rename folders that already exist in SharePoint by providing alternate names

As with renaming an existing file, DocKIT can rename folders that already exist in SharePoint. As indicated already, only the batch file syntax can be used to rename folders and files that already exist in SharePoint. In the batch file, you have to specify the New Folder column in addition to the Destination Path column.

Rename folders & files using folder & file naming rules

In the next blog post, we will discuss another useful option in DocKIT: renaming folders / files based on patterns and removing illegal characters from folder / file names while migrating to SharePoint.

Read: DocKIT: Rename folders/files based on patterns using renaming rules

DocKIT: Rename folders/files based on patterns using renaming rules

Continuing from the previous blog post on folder and file rename options, here we discuss how to rename folders / files that contain illegal characters.

As said earlier, folders and files residing in legacy systems may have names containing special characters that are invalid (disallowed) in SharePoint. DocKIT handles this by applying folder and file naming rules while migrating into SharePoint.

Some users might also need to migrate different versions of a file whose names carry the version number in a certain pattern. DocKIT supports creating file version history from such names as well.

Rename folders & files using naming rules

This option is intended and recommended for renaming folders & files whose names contain characters that SharePoint does not accept.

SharePoint disallows certain characters in folder and file names. To work around this, you need to replace the characters that appear in Windows folder & file names but are invalid in SharePoint libraries with valid characters. In SharePoint, all characters are valid except ~ # % & { } * | \ : < > / ?

For this, you can use the “Folder & File Renaming Rules” feature in DocKIT. This feature uses regular expressions, a technique widely used in software applications where character pattern matching is crucial.
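Before writing any rules, it can help to know which names are affected. The following is a minimal Python sketch, independent of DocKIT, that reports which of the characters listed above occur in a given name. The function name `find_invalid_chars` is made up for this example, and the character set is only the one quoted in this post.

```python
# Characters this post lists as invalid in SharePoint folder and file names.
INVALID_CHARS = set('~#%&{}*|\\:<>/?')


def find_invalid_chars(name):
    """Return the invalid characters found in a name, in order of first appearance."""
    found = []
    for ch in name:
        if ch in INVALID_CHARS and ch not in found:
            found.append(ch)
    return found


for name in ["My File name#.xls", "Programs & Schedules.docx", "Jan Report.xls"]:
    print(name, "->", find_invalid_chars(name) or "ok")
```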

Consider the following cases:

Example 1: Order of entries when constructing renaming rules

The Folder and File Renaming rule in DocKIT requires the rule entries to be correctly placed for successful and meaningful replacement of invalid characters. The order of the entries (sequence) is important depending on the name of the files and folders. For example, if you want dot and double underscore OR triple dots to be treated as a single entity, then a rule entry with this value has to be given first. After this, you should provide dot and single underscore OR double dots value as entry.

Example:

| Invalid File Name | Valid File Name | Description of the renaming requirement |
|---|---|---|
| Sa...ple File 1.doc | Sample File 1.doc | Replace 3 dots with ‘m’ |
| Sam..le File2.doc | Sample File2.doc | Replace 2 dots with ‘p’ |
| _ample File3.doc | Sample File3.doc | Replace _ with ‘S’ |
| .Feb Report.xlsx | Feb Report.xlsx | Remove dot if the file name starts with . (dot) |
| Jan Report.xls | Jan Report.xls | Remove dot if the file name ends with . (dot); replace two consecutive dots with one |
| ._March Report.pdf | March Report.pdf | Replace ._ with empty |
| .__ April Report.txt | April Report.txt | Replace .__ (dot, two underscores) with empty |
| __ May Report.pptx | May Report.pptx | Replace __ (two consecutive underscores) with empty |
| ._. June Report.ppt | June Report.ppt | Replace ._. (dot, underscore, dot) with empty |

Equivalent Naming Rules:

| Find | Replace | Description |
|---|---|---|
| \.\.\. | m | Replace 3 dots with ‘m’ |
| \.\. | p | Replace 2 dots with ‘p’ |
| \.* | (empty) | Remove dot if the file name starts with . (dot) |
| *\. | (empty) | Remove dot if the file name ends with . (dot) |
| \.__ | (empty) | Replace .__ (dot, two underscores) with empty |
| \._\. | (empty) | Replace ._. (dot, underscore, dot) with empty |
| \._ | (empty) | Replace ._ with empty |
| __ | (empty) | Replace __ (two consecutive underscores) with empty |
| _ | S | Replace _ with ‘S’ |

Figure 1: Naming rule as specified in DocKIT application
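To see why the order of entries matters, here is a small Python sketch that applies (find, replace) rules one after another using Python's `re` module. It is only a stand-in for illustration: DocKIT's own rule engine may interpret patterns differently (for instance, the `\.*` and `*\.` entries above do not have the same meaning in Python), and `apply_rules` is a made-up helper name.

```python
import re

# (find, replace) pairs, most specific pattern first.
RULES = [
    (r"\.\.\.", "m"),  # three dots -> 'm'
    (r"\.\.", "p"),    # two dots -> 'p'
    (r"_", "S"),       # single underscore -> 'S'
]

# Same first two rules, but with the two-dot rule placed before the three-dot rule.
WRONG_ORDER = [RULES[1], RULES[0]]


def apply_rules(name, rules):
    """Apply each (find, replace) rule to the name, in the order given."""
    for pattern, replacement in rules:
        name = re.sub(pattern, replacement, name)
    return name


print(apply_rules("Sa...ple File 1.doc", RULES))        # Sample File 1.doc
print(apply_rules("Sa...ple File 1.doc", WRONG_ORDER))  # Sap.ple File 1.doc
```

With the two-dot rule first, the first two of the three dots are consumed as a pair, so the three-dot rule never gets a chance to match.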

Example 2: Usage of reserved characters

The regular expression syntax that DocKIT uses for pattern matching treats some characters as reserved operators with their own meaning. To treat a reserved character as a normal character like ‘A’ or ‘1’, prefix it with ‘\’ (backslash) as the escape character in the DocKIT renaming rule, as given below:

| Invalid file name | Valid file name | Description | Find | Replace |
|---|---|---|---|---|
| My File name#.xls | My File name.xls | Replace # with empty | \# | (empty) |
| Programs & Schedules.docx | Programs and Schedules.docx | Replace & with ‘and’ | \& | and |
| My $ File name.docx | My File name.docx | Replace $ with empty | \$ | (empty) |
| policydoc *74565.xls | policydoc 74565.xls | Replace * with empty | \* | (empty) |
| Sample.Doc | SampleDoc | Replace . with empty | \. | (empty) |
| Sample?File | SampleTextFile | Replace ? with ‘Text’ | \? | Text |
| Test^Document | TestDocument | Replace ^ with empty | \^ | (empty) |
| Sample+File | SampleTextFile | Replace + with ‘Text’ | \+ | Text |
| Word<Document | WordDocument | Replace < with empty | \< | (empty) |
| PDF>Document | PDFDocument | Replace > with empty | \> | (empty) |
| [Document | Document | Replace [ with empty | \[ | (empty) |
| Sample] | SampleFile | Replace ] with ‘File’ | \] | File |
| (File | DocumentFile | Replace ( with ‘Document’ | \( | Document |
| TextFile) | TextFile | Replace ) with empty | \) | (empty) |
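The same escaping idea can be tried out in Python, whose `re.escape` function adds the backslashes for you. This is a sketch of the general technique rather than of DocKIT itself, and `replace_literal` is a hypothetical helper name.

```python
import re


def replace_literal(name, find, replace=""):
    """Replace every literal occurrence of `find`, escaping regex metacharacters."""
    # A function is used as the replacement so the text is inserted as-is.
    return re.sub(re.escape(find), lambda _match: replace, name)


print(replace_literal("Programs & Schedules.docx", "&", "and"))
print(replace_literal("policydoc *74565.xls", "*"))
print(replace_literal("Sample+File", "+", "Text"))

# Without escaping, a reserved character such as '*' is not a valid pattern on its own.
try:
    re.compile("*")
except re.error as exc:
    print("unescaped '*' is rejected:", exc)
```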

Create file version history by using certain naming patterns

We will discuss a situation where certain files in a folder have their names appended with version numbers. If you would like these version numbers to stack up nicely as version history of the file / document, this is what you can do.

Example 1: Uniform file naming pattern for file version

Naming rule: Find *_v??\.?? and Replace with empty. This replaces the underscore, ‘v’ and version number appended to the parent file name with empty, so that the files are created as respective versions of one file.

| Invalid File Name | Valid File Name | Description |
|---|---|---|
| My File name_v1.0.docx | My File name.docx | First version |
| My File name_v1.1.docx | My File name.docx | Next minor version |
| My File name_v1.2.docx | My File name.docx | Next minor version |

Example 2: Random file naming pattern for file version:

Naming rule: Find entries *_v*, *v??\.?? and v, each with Replace left empty. This replaces the underscore, ‘v’ and version number appended to the parent file name with empty, so that the files are created as respective versions of one file.

| Invalid File Name | Valid File Name | Description |
|---|---|---|
| My doc_v1.0.doc | My doc.doc | First version |
| My docv1.1.doc | My doc.doc | Next minor version |
| My docv1.2.doc | My doc.doc | Next minor version |

Note: Entries under the Find section should be in the same order as listed.
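To preview which files would stack up as versions of the same document, a short Python sketch like the one below can group the source names by their stripped base name. The regular expression is written in Python syntax and is an assumption about the naming pattern (an optional underscore, ‘v’, then major.minor just before the extension); it is not the DocKIT rule syntax shown above, and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed pattern: optional underscore, 'v', major.minor, right before the extension.
VERSION_SUFFIX = re.compile(r"_?v(\d+)\.(\d+)(?=\.[^.]+$)")


def split_version(file_name):
    """Return (base file name, (major, minor)), or (file_name, None) if no suffix."""
    match = VERSION_SUFFIX.search(file_name)
    if match is None:
        return file_name, None
    base = file_name[:match.start()] + file_name[match.end():]
    return base, (int(match.group(1)), int(match.group(2)))


def group_versions(file_names):
    """Group source names under their base name, ordered by version number."""
    groups = defaultdict(list)
    for name in file_names:
        base, version = split_version(name)
        if version is not None:
            groups[base].append((version, name))
    return {base: [n for _v, n in sorted(items)] for base, items in groups.items()}


print(group_versions(["My docv1.2.doc", "My doc_v1.0.doc", "My docv1.1.doc"]))
```

Sorting by the numeric (major, minor) pair rather than by file name keeps a version such as 1.10 after 1.9.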

Figure 2: File Settings in DocKIT Task Wizard

To create version history, select the ‘Create new version’ option in the ‘File Settings’ step of the DocKIT Task Wizard.

For more information, please refer to the following help sections in the online help document. You can launch the help document from the Help menu (Help -> Contents) in the application.

  • DocKIT Features -> Folder & File Renaming Rules
  • DocKIT Special Cases -> Rename Documents in File System

June 21, 2011

Why do I get this error in DocKIT?

Why do I get this error in DocKIT? - “The column values could not be assigned for this file, since there was no corresponding entry in the external metadata file”

DocKIT assigns metadata when migrating documents to SharePoint. DocKIT can also assign metadata to documents that already reside in SharePoint. You can assign the same metadata for all documents / files in a folder or different metadata for each of the documents / files in the folder. 

To accomplish this, you have to specify the absolute path of each file or folder along with the appropriate metadata for the SharePoint columns. The absolute file / folder path depends on the source from which the files / folders were added for migration, or on the destination SharePoint location.

If the path in the external metadata file and the path of the added source contents do not match, DocKIT reports this error message: “The column values could not be assigned for this file since there was no corresponding entry in the external metadata file.”

Please note the following basic requirements about the contents in the external metadata file (provided as input to DocKIT software): 

DocKIT expects the first column in the metadata file to be named ‘Path’.
The ‘Path’ value in the metadata file should be the path of the file or folder, specified in any of the following formats:
  Local path: C:\My Documents\Versions Demo\SampleTest1.doc (or)
  UNC path: \\SPServer\MyDocs\Sample1.doc (or)
  Mapped drives: M:\MyDocs\Sample1.doc
The path from which folders / files were added for import using DocKIT user interface must exactly match the path specified under ‘Path’ field in the metadata file. In other words, if you added folders / files from the D: drive (e.g., D:\My Documents) in the DocKIT explorer window, you must ensure that the path field in the external metadata file uses the D: drive to point to this specific file or folder.
The columns listed in the metadata file should be available in SharePoint prior to metadata assignment.
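One way to catch this error before running a task is to compare the paths you plan to add against the ‘Path’ column of the metadata file. The Python sketch below assumes the metadata file is comma-separated with ‘Path’ as its first column; the ‘Department’ column, the sample paths and the helper names are made up for illustration.

```python
import csv
import io


def load_metadata_paths(stream):
    """Read the 'Path' column of a CSV metadata file into a set."""
    reader = csv.DictReader(stream)
    return {row["Path"] for row in reader}


def files_without_entry(added_files, metadata_paths):
    """Return the added files whose exact path is missing from the metadata file."""
    return [path for path in added_files if path not in metadata_paths]


# Stand-in for a metadata file on disk; 'Department' is a hypothetical column.
metadata_csv = io.StringIO(
    "Path,Department\n"
    "D:\\My Documents\\Sample1.doc,Finance\n"
    "M:\\MyDocs\\Sample2.doc,Sales\n"
)
added = [
    "D:\\My Documents\\Sample1.doc",
    "\\\\SPServer\\MyDocs\\Sample2.doc",  # added via UNC path, listed via mapped drive
]
missing = files_without_entry(added, load_metadata_paths(metadata_csv))
print(missing)
```

The second file is reported as missing even though the mapped drive and the UNC share may point to the same document, because the comparison is on the exact path text.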

I will now explain how to assign document metadata correctly in a few common scenarios using some examples.

Example 1: Import files and folders from local file system to SharePoint and assign metadata

 

Example 2: Import files and folders from network share (UNC) to SharePoint and assign metadata.

 

Example 3: Import files and folders from mapped network drive to SharePoint and assign metadata.

 

Example 4: Assign metadata for files already available in SharePoint
