Ever tried to find a blob store that can work on-premises as well as in a cloud, support meta-data, scale well and have .NET client libraries?
I did, and I settled on MinIO. Honestly, to my surprise, I was quite limited in my choice. It's free, it's open-source, it can work on-premises, and it has Helm charts for k8s. The best thing is that it's S3-compatible, so if one day you move to the cloud, the only thing you'll need to change in your code is a connection string.
The easiest way to get started is with the Docker image. Pull the image:
docker pull minio/minio
Start it for testing (the data is stored inside the container, so after a restart all files will be gone):
docker run -p 9000:9000 minio/minio server /data
Or start with a mapped volume on Windows:
docker run -p 9000:9000 --name minio1 -v C:\data:/data minio/minio server /data
When the server is up, you can access it at http://127.0.0.1:9000/minio/login
Default user/password: minioadmin/minioadmin
Working with .NET
MinIO has its own .NET client (https://docs.min.io/docs/dotnet-client-quickstart-guide.html), but it looked quite raw to me. For example, I'm not sure why GetObjectAsync does not return blob info, even though internally it loads it every time; this way, I have to make one extra call for each file. Or why the stream operation is an Action, and why some of the operations with streams and files are not async internally. Anyway, it's open-source (https://github.com/minio/minio-dotnet), so you can take a look yourself or even contribute :)
Using S3 .NET SDK with MinIO
So, after reviewing the MinIO SDK, I decided to give the native Amazon S3 SDK a try: https://www.nuget.org/packages/AWSSDK.S3/ (remember, MinIO has an S3-compatible API).
I had to play a bit with the connection params, but quickly found a combination that worked and allowed me to connect to MinIO on-premises:
var awsConfig = new AmazonS3Config()
{
ServiceURL = "http://127.0.0.1:9000",
ForcePathStyle = true,
UseHttp = true
};
AWSCredentials creds = new BasicAWSCredentials(config.AccessKey, config.SecretKey);
_s3Client = new AmazonS3Client(creds, awsConfig);
Now we can store the first object:
var request = new PutObjectRequest
{
    BucketName = bucketName,
    Key = $"{Guid.NewGuid()}{Path.GetExtension(incomingFile.FileName)}",
    ContentType = incomingFile.ContentType,
    InputStream = incomingFile.Data
};
await _s3Client.PutObjectAsync(request, cancellationToken).ConfigureAwait(false);
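The key combines a fresh GUID with the uploaded file's original extension, so names never collide while the file type stays recoverable. Pulled out as a standalone sketch (ObjectKeys is a hypothetical helper name, not part of the SDK):

```csharp
using System;
using System.IO;

static class ObjectKeys
{
    // Build a unique storage key that preserves the original file extension,
    // e.g. "report.pdf" -> "3f2504e0-4f89-11d3-9a0c-0305e82c3301.pdf".
    public static string For(string fileName) =>
        $"{Guid.NewGuid()}{Path.GetExtension(fileName)}";
}
```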
Some things to pay attention to:
1) When you work with metadata, you need to prefix every key with "X-Amz-Meta-", or it will be added automatically during persistence.
const string S3Prefix = "X-Amz-Meta-";
const string FileNameField = S3Prefix + "Filename";
Metadata[FileNameField] = fileName;
2) Bucket names have limitations, e.g. they must start with a lowercase letter. In some cases the SDK will throw an exception, but in others you might just receive an empty object as a response, so it's good to conform your bucket name for both create and get.
3) There is no built-in method to check if a bucket exists. I decided to use the 'GetBucketTaggingAsync' method, as it doesn't throw when you try to access a nonexistent bucket.
async ValueTask<bool> CreateBucketIfNotExist(string bucketName, CancellationToken cancellationToken = default)
{
    var request = new GetBucketTaggingRequest { BucketName = bucketName };
    var result = await _s3Client.GetBucketTaggingAsync(request, cancellationToken);
    if (result?.HttpStatusCode == System.Net.HttpStatusCode.NotFound)
    {
        await _s3Client.PutBucketAsync(bucketName, cancellationToken);
        return true;
    }
    return false;
}
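The first two gotchas can be guarded with small helpers. A minimal sketch, assuming you normalize keys and names yourself before calling the SDK (MetaKeys and BucketNames are hypothetical names, not SDK APIs):

```csharp
using System;
using System.Text.RegularExpressions;

static class MetaKeys
{
    const string S3Prefix = "X-Amz-Meta-";

    // Ensure a metadata key carries the "X-Amz-Meta-" prefix exactly once,
    // so the key you write is the key you read back.
    public static string WithPrefix(string key) =>
        key.StartsWith(S3Prefix, StringComparison.OrdinalIgnoreCase)
            ? key
            : S3Prefix + key;
}

static class BucketNames
{
    // Conform a name to the common S3 bucket rules: lowercase letters,
    // digits and hyphens only; starts and ends with a letter or digit;
    // 3-63 characters.
    public static string Normalize(string name)
    {
        var cleaned = Regex.Replace(name.ToLowerInvariant(), "[^a-z0-9-]", "-").Trim('-');
        if (cleaned.Length < 3) cleaned = (cleaned + "bkt").Substring(0, 3);
        if (cleaned.Length > 63) cleaned = cleaned.Substring(0, 63).TrimEnd('-');
        return cleaned;
    }
}
```

Run your bucket name through Normalize before both create and get, and your metadata keys through WithPrefix before assigning them, so the same value is used on write and read.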
Getting and processing the data is quite easy:
var request = new GetObjectRequest { Key = objectId, BucketName = bucketName };
GetObjectResponse result = await _s3Client.GetObjectAsync(request, cancellationToken);
await result.ResponseStream.CopyToAsync(fileStream);
As you can see, we now receive all the object headers (metadata) together with a reference to the stream in a single call.
That's it, hope that was useful.