GridFS Support in Spring Data MongoDB - codecentric AG Blog


MongoDB

MongoDB is a highly scalable, document-oriented NoSQL datastore from 10gen. For more information, have a look at the MongoDB homepage: http://www.mongodb.org. A short introduction to MongoDB can be found in this blog post.

GridFS

In MongoDB, the size of a single record (i.e. a JSON-like document, stored as BSON) is limited to 16 MB. If you want to store and query binary data larger than that, you have to use the GridFS API of your MongoDB driver. There is also a mongofiles command-line tool in the bin folder of your MongoDB installation, which we will use for our examples. In an empty database, there are no GridFS files:

C:\mongo\bin>mongofiles list
connected to: 127.0.0.1
 
C:\mongo\bin>

But the first use of GridFS creates two new collections:

C:\mongo\bin>mongo
MongoDB shell version: 2.0.5
connecting to: test
> show collections
fs.chunks
fs.files

fs.files stores metadata, and fs.chunks holds the binary data itself. So far, these collections are empty:

> db.fs.chunks.count()
0
> db.fs.files.count()
0
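
To make the division of labour between the two collections more concrete, here is a minimal in-memory sketch. It is not the driver's API: the class GridFsLayoutSketch, its FileEntry and Chunk types, and the file name example.bin are all hypothetical names chosen for illustration. The idea is that one metadata entry describes the file, while the content is cut into numbered slices of at most chunkSize bytes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of the GridFS layout; NOT the MongoDB driver API.
final class GridFsLayoutSketch {

    /** Stand-in for one fs.files document: metadata only, no content. */
    static final class FileEntry {
        final String filename;
        final long length;
        final int chunkSize;

        FileEntry(String filename, long length, int chunkSize) {
            this.filename = filename;
            this.length = length;
            this.chunkSize = chunkSize;
        }
    }

    /** Stand-in for one fs.chunks document: sequence number n plus a slice of the content. */
    static final class Chunk {
        final int n;
        final byte[] data;

        Chunk(int n, byte[] data) {
            this.n = n;
            this.data = data;
        }
    }

    /** Cuts the content into consecutive slices of at most chunkSize bytes. */
    static List<Chunk> split(byte[] content, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Chunk> chunks = new ArrayList<>();
        int n = 0;
        for (int offset = 0; offset < content.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, content.length);
            chunks.add(new Chunk(n++, Arrays.copyOfRange(content, offset, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] content = new byte[1000]; // dummy payload
        FileEntry entry = new FileEntry("example.bin", content.length, 256);
        List<Chunk> chunks = split(content, entry.chunkSize);
        // prints "example.bin: 4 chunks, last holds 232 bytes"
        System.out.println(entry.filename + ": " + chunks.size() + " chunks, last holds "
                + chunks.get(chunks.size() - 1).data.length + " bytes");
    }
}
```

Only the last chunk may be shorter than chunkSize; all others are filled completely.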

Now we insert a first file by switching back to the command line and typing:

C:\mongo\bin>mongofiles put mongo.exe
connected to: 127.0.0.1
added file: { _id: ObjectId('501004130b07c6ab3fb01fa3'), filename: "mongo.exe", chunkSize: 262144, 
uploadDate: new Date(1343226899351), md5: "f5e82e7d4b7ae87a1d6e80dfc7f43468", length: 1885696 }
done!
 
C:\mongo\bin>

Back on the mongo shell we can check our fs collections:

E:\mongo\bin>mongo
MongoDB shell version: 2.0.5
connecting to: test
> db.fs.files.count()
1
> db.fs.chunks.count()
8

So uploading the mongo.exe binary produced one file entry in fs.files, with its content split into 8 chunks in fs.chunks. So much for that; for more details check out the help with
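
The chunk count can be checked against the numbers mongofiles printed for the upload (length: 1885696, chunkSize: 262144). The following is a small sketch of that arithmetic; the class name ChunkCount and its methods are made up for this illustration and do not belong to any MongoDB library.

```java
// Hypothetical helper for illustration; not part of the MongoDB driver or Spring Data.
final class ChunkCount {

    /** Number of chunks needed for a file: length divided by chunkSize, rounded up. */
    static long chunkCount(long length, long chunkSize) {
        if (length < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and chunkSize > 0");
        }
        return (length + chunkSize - 1) / chunkSize;
    }

    /** Size of the final chunk, which holds whatever the full chunks leave over. */
    static long lastChunkSize(long length, long chunkSize) {
        long chunks = chunkCount(length, chunkSize);
        return chunks == 0 ? 0 : length - (chunks - 1) * chunkSize;
    }

    public static void main(String[] args) {
        long length = 1885696L;   // "length" reported by mongofiles put
        long chunkSize = 262144L; // "chunkSize" reported by mongofiles put

        System.out.println(chunkCount(length, chunkSize));    // prints 8
        System.out.println(lastChunkSize(length, chunkSize)); // prints 50688
    }
}
```

Seven chunks are filled completely (7 × 262144 = 1835008 bytes), and the remaining 50688 bytes go into the eighth, which matches the count of 8 reported by db.fs.chunks.count().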

GridFS Support in Spring Data MongoDB

The Spring Data MongoDB project has supported access to the GridFS API since the milestone release 1.1.0.M1. In general, Spring Data is another abstraction layer on top of the lower-level MongoDB Java Driver:

[Figure: MongoDB GridFS w/ Spring Data]

Let’s see how to use the GridFS support. First, we grab the latest milestone release …

<dependency>
   <groupId>org.springframework.data</groupId>
   <artifactId>spring-data-mongodb</artifactId>
   <version>1.1.0.M2</version>
</dependency>

… from Spring’s snapshot repository:

<repository>
    <id>spring-snapshot</id>
    <name>Spring Maven SNAPSHOT Repository</name>
    <url>http://repo.springsource.org/libs-snapshot</url>
</repository>

Spring Data MongoDB offers a GridFsTemplate to handle the GridFS operations:

<!-- Connection to MongoDB server; without an explicit id, the factory bean is registered as "mongoDbFactory" -->
<mongo:db-factory host="localhost" port="27017" dbname="test" />
<mongo:mapping-converter id="converter" db-factory-ref="mongoDbFactory"/>
 
<!-- MongoDB GridFS Template -->
<bean id="gridTemplate" class="org.springframework.data.mongodb.gridfs.GridFsTemplate">
  <constructor-arg ref="mongoDbFactory"/>
  <constructor-arg ref="converter"/>
</bean>

With that template we can easily read the existing GridFS file we inserted before on the command line:

@Autowired
GridFsTemplate template;

@Test
public void shouldListExistingFiles() {
    // A null query is passed here so that all GridFS files are returned
    List<GridFSDBFile> files = template.find(null);

    for (GridFSDBFile file : files) {
        System.out.println(file);
    }
}

Running the above example, you should see an output like this:

{ "_id" : { "$oid" : "4fe9bda0f2abbef0d127a647"} , "chunkSize" : 262144 , "length" : 2418176 , 
   "md5" : "19c2a2cc7684ce9d497a59249396ae1d" , "filename" : "mongo.exe" , "contentType" :  null  , 
  "uploadDate" : { "$date" : "2012-06-26T13:48:16.713Z"} , "aliases" :  null }

To access the content of a file, call GridFSDBFile#getInputStream. You can also store and delete GridFS files quite easily; for details, have a look at the GridFsTemplate.

The full source code of the above example can be found on GitHub.