But how do you do it? You quickly realize that it’s not a technical problem, but an informational one. What is the right taxonomy? How do you select metadata? Do you have to establish policies and procedures to keep users from destroying your beautiful system, once it’s been set up?
Many books have been written on this subject. My search through them for a quick solution was mostly frustrating. The experts simply refuse to say, “Organize your files this way,” for the same reason, I suppose, that doctors don’t prescribe for strangers over the telephone: individual cases differ enough that it’s easy to get the prescription wrong, even if you’ve correctly diagnosed the problem.
Pretty soon you get the idea that the users in each department must help design their new file system. They must tell you which categories of documents matter to their department, and which features of those documents should be captured as metadata. It’s still easy to get things wrong, however, because users simply may not be able to do this. Worse, they may think they know what they need when they really don’t.
What Do Users Know?
Suppose you’ve been wise enough not to just ask users what folders or metadata would be useful. Instead, you’ve asked them what actions they perform on documents in the context of a process they often go through. You may still find that they have a hard time expressing what they do. This is called tacit knowledge: they do things without being conscious of how they do them. Their knowledge isn’t explicit enough to be useful for reorganizing things.
One expert I’ve read, Alfred de Weerd, proposes interviewing multiple people about their use of documents in processes, comparing their different views, and coming back with more detailed questions. It’s also common to hold a brainstorming session or to create a workable demo of a content management system that the users can criticize. I’m working on another method, however: presenting users with a sort of non-working model, namely the old file structure.
Exposing the Issues
The question then is how to present the old structure so that its vices are apparent. Once you’ve identified them, you can ask users, “Why did you do it this way?” Better, ask, “Why did somebody do it this way?” People make many mistakes for good reasons. The road to Hell is paved with good intentions. Discover the good intentions, and you have a basis for a new system.
So far, I’ve discovered two relatively quick ways to expose issues with the current system. One is to see how all its files are named and organized. On a traditional Windows file server, this is not hard to do. You just expand the tree view of the files until all are visible.
This shows you a lot, such as whether the same lists of files appear in multiple places. For example, you may have a list of grantees in a folder of letters of intent, and again in a folder of full proposals, and again in a folder of approved proposals, and again in a folder of signed grant letters.
You can also see if the folder names make sense. You may see subfolders called “For Audrey” in all the grantee files for 2013, but for no year before or since. Other idiosyncratic file names may show up, once the file structure is opened for all to see.
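If you’d rather script this than expand folders by hand, the same two checks can be roughed out in a few lines of Python. This is only a sketch under my own assumptions: the function names `print_tree` and `repeated_names` are made up for illustration, and it assumes the file share is reachable as an ordinary folder path from the machine running the script.

```python
import os
from collections import defaultdict


def print_tree(root, indent=""):
    """Print every folder and file under root as an indented outline."""
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        print(indent + name)
        if os.path.isdir(path):
            # Recurse into subfolders, indenting one level deeper.
            print_tree(path, indent + "    ")


def repeated_names(root):
    """Return {name: [parent folders]} for names that occur in more than one place."""
    seen = defaultdict(list)
    for folder, subfolders, files in os.walk(root):
        for name in subfolders + files:
            seen[name].append(folder)
    return {name: places for name, places in seen.items() if len(places) > 1}


# Example use (the path is hypothetical):
#   print_tree(r"\\fileserver\grants")
#   for name, places in repeated_names(r"\\fileserver\grants").items():
#       print(name, "appears in", len(places), "folders")
```

A name that turns up in the letters of intent folder, the full proposals folder, and the approved proposals folder is exactly the kind of thing to bring to the users with the question, “Why did somebody do it this way?”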
Dealing with Duplicates
The second thing that is easy to do is to check for duplicates, or near-duplicates. For Windows servers, several products exist to automatically extract and display all the duplicates on a server. I’ve had some success with DuplicateCleaner, but you can decide what works for you. Duplicates tell you a lot. First, their mere existence can be confusing. If you do a search and get seven folders with the same name, which do you pick? Worse is when you find near-duplicates that are different versions. Which is the right one?
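To make the idea concrete, here is a minimal sketch of how exact duplicates can be found by comparing file contents. It is not how DuplicateCleaner or any other product works internally (I don’t know that); it is just one common approach, and the names `file_digest` and `find_duplicates` are my own. It only catches byte-for-byte copies, so near-duplicates such as edited versions will not show up.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group is a sorted list of paths with identical content."""
    # First pass: group by size, since files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only the files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Each group in the result lists every location of one document, which is the raw material for the questions that follow.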
By looking at the different locations of the duplicates, you also get some insight into what processes put them there and how the file structure reflects those processes. So, one duplicate is in proposals and another is in approved grants and another is in signed grants, etc. I’ve even seen a file structure in one department that was a perfect duplicate of the structure in another, which raises the question of whether both are necessary.
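If you suspect that one department’s structure mirrors another’s, a quick comparison of the two trees can confirm it. The sketch below is an assumption-laden illustration, not a finished tool: `relative_paths` and `compare_structures` are hypothetical names, and the comparison looks only at folder and file names, not at file contents.

```python
import os


def relative_paths(root):
    """Collect every folder and file under root as a path relative to root."""
    paths = set()
    for folder, subfolders, files in os.walk(root):
        for name in subfolders + files:
            paths.add(os.path.relpath(os.path.join(folder, name), root))
    return paths


def compare_structures(left_root, right_root):
    """Report which relative paths the two trees share and which are unique to each."""
    left = relative_paths(left_root)
    right = relative_paths(right_root)
    return {
        "shared": sorted(left & right),
        "only_left": sorted(left - right),
        "only_right": sorted(right - left),
    }
```

If both “only” lists come back empty, the two structures are name-for-name copies of each other, and it’s fair to ask whether both are necessary.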
This is when you ask the appropriate users, “Why did somebody do it this way?” You may also ask, “Can you imagine ever using this stuff again?” At this point, many other questions may also come to mind.
SharePoint Tree Views
I called this article “Fix Your SharePoint File Structure,” so here are some ideas about how to get a tree view for a SharePoint library and how to find duplicates. There’s a nice discussion about how to build a tree structure on answers.microsoft.com. It’s a little complicated, and you will need administrator rights. I’ve used it recently, however, and it works. Please note that there used to be simpler ways to do this, but Microsoft recently eliminated the “Content and Structure” page under “Site Administration” in SharePoint Online, so don’t try to find that.
As for duplicate finding, there is an application that claims to do this for SharePoint the way others do it for Windows servers. It costs $495 and doesn’t have a free trial, however, so I haven’t tried it. The other way is to use Office 365 search. Not all document libraries have this search, I’ve noticed. The one you want is the Classic View search that can search all of SharePoint and “Everything.” On the site that didn’t have this in its SharePoint document libraries, I found it in OneDrive. Basically, you do your search in OneDrive, fail, then expand the search to all of SharePoint by clicking the link at the bottom of the page.
Of course, to find duplicates, you must be able to view them. This option may be turned off, like so many in SharePoint that have caused me to pull out my hair. Here’s an article that tells you how to do it, but you’re going to need some SharePoint editing knowledge to understand it. Basically, you do a search with the good search interface, such as the one I found in OneDrive. Then you edit the results page and click the little black triangle (Microsoft is built on little black triangles) at the upper right-hand corner of the Search Results box, which displays a box with settings. In that box, check the “Show View Duplicates link” option. Save your changes and exit editing.
Now when you examine your search results, “View Duplicates” will show up under the pop-up document preview window. Clicking on that will produce a list of all the duplicates of the file. Critics of the system say it’s not perfect, but it’s good enough to be helpful.
This discussion is only the beginning of what you need to build that new SharePoint file structure. But hopefully it gets you past one stumbling block and to the point where those books by the experts can take you the rest of the way. Good hunting!