Description
Hi,
I'd like to train a model for depth estimation on monocular RGB pictures.
I think this can be done through regression with ResNet or DenseNet.
I have a dataset ( https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html ) with pairs of pictures (input / expected result):
Rgb_img_1 / depth_img_1
I also have an Excel file listing the path to each file.
I started with the multiclass classification tutorial ( https://docs.microsoft.com/fr-fr/dotnet/machine-learning/tutorials/image-classification ), but now I have to turn it into a regression model, since I'm looking for a depth value for each pixel of a picture.
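To make the target concrete, here is a minimal sketch of what "a depth value for each pixel" means when comparing a prediction to the ground truth. It is plain C# and does not use ML.NET. The names `DepthMetrics` and `Rmse` are hypothetical, and root-mean-square error is only one common choice of error measure that I am assuming here.

```csharp
using System;

public static class DepthMetrics
{
    // Root-mean-square error over all pixels of two depth maps.
    // Both maps are indexed [row, column] and must have the same size.
    public static double Rmse(float[,] predicted, float[,] groundTruth)
    {
        int height = predicted.GetLength(0);
        int width = predicted.GetLength(1);

        if (height != groundTruth.GetLength(0) || width != groundTruth.GetLength(1))
            throw new ArgumentException("Depth maps must have the same dimensions.");
        if (predicted.Length == 0)
            throw new ArgumentException("Depth maps must not be empty.");

        double sumSquared = 0.0;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                // Each pixel contributes its own regression error.
                double diff = predicted[y, x] - groundTruth[y, x];
                sumSquared += diff * diff;
            }
        }

        return Math.Sqrt(sumSquared / predicted.Length);
    }
}
```

Unlike the classification tutorial, where each image has a single label, the label here is a whole array with one value per pixel, so the model output has to have the same shape as the depth image.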
I know that I have to change my model generation:
```csharp
public static ITransformer GenerateModel(MLContext mlContext)
{
    IDataView trainingData = mlContext.Data.LoadFromTextFile<ImageData>(path: _trainTagsCsv, separatorChar: ',', hasHeader: false);

    IEstimator<ITransformer> pipeline = mlContext.Transforms.LoadImages(outputColumnName: "input", imageFolder: _imagesFolder, inputColumnName: nameof(ImageData.InputImagePath))
        // The image transforms convert the images into the model's expected format.
        .Append(mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: InceptionSettings.ImageWidth, imageHeight: InceptionSettings.ImageHeight, inputColumnName: "input"))
        .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input", interleavePixelColors: InceptionSettings.ChannelsLast, offsetImage: InceptionSettings.Mean))
        .Append(mlContext.Model.LoadTensorFlowModel(_inceptionTensorFlowModel)
            .ScoreTensorFlowModel(outputColumnNames: new[] { "softmax2_pre_activation" }, inputColumnNames: new[] { "input" }, addBatchDimensionInput: true))
        .Append(mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: "LabelKey", inputColumnName: "Label"))
        .Append(mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(labelColumnName: "LabelKey", featureColumnName: "softmax2_pre_activation"))
        .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabelValue", "PredictedLabel"))
        .AppendCacheCheckpoint(mlContext);

    ITransformer model = pipeline.Fit(trainingData);

    // The test file is also comma-separated, so the separator must be given here too
    // (the default separator is a tab).
    IDataView testData = mlContext.Data.LoadFromTextFile<ImageData>(path: _testTagsCsv, separatorChar: ',', hasHeader: false);
    IDataView predictions = model.Transform(testData);

    // Create an IEnumerable for the predictions for displaying results
    IEnumerable<ImagePrediction> imagePredictionData = mlContext.Data.CreateEnumerable<ImagePrediction>(predictions, true);
    DisplayResults(imagePredictionData);

    MulticlassClassificationMetrics metrics = mlContext.MulticlassClassification
        .Evaluate(predictions, labelColumnName: "LabelKey", predictedLabelColumnName: "PredictedLabel");
    Console.WriteLine($"LogLoss is: {metrics.LogLoss}");
    Console.WriteLine($"PerClassLogLoss is: {String.Join(" , ", metrics.PerClassLogLoss.Select(c => c.ToString()))}");

    return model;
}
```
Could you tell me where I can find docs and resources to understand:
- how to choose and use an appropriate model
- how to transform my inputs to make them usable by the model

Also, I have a .onnx file of DenseNet. Would it be easier to go this way instead of using an ML.NET model? (But I'd like to deeply understand the ML.NET framework.)

Also, I took a look at AutoML, but I don't think it can solve my regression problem with image inputs. Is this right?
Thanks,