Sony's Developer World forum


    Building a custom Image Classification Model

    Spresense
    • S
      suburban-daredevil last edited by

      Hello,

      I am trying to build a custom deep-learning image classification model with TensorFlow, deploy it on Spresense, and use it for inference. I tried to create a project structure similar to the tflmrt_lenet directory, but when I create a new project with sdk$ ./tools/mkcmd.py my_project, I only get a basic directory structure. Many of the files in the /examples/tflmrt_lenet directory are missing from it. How do I get those missing files? Copying and pasting them from the tflmrt_lenet directory produces errors.

      Any help with getting started on deploying custom DL-based models on Spresense is appreciated.

      Thank you

      @KamilTomaszewski

      • K
        KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

        Hi @suburban-daredevil

        Please try this:

        1. Copy the whole sdk/apps/examples/tflmrt_lenet folder to a directory with a new name.

        2. Replace all tflmrt_lenet strings in Makefile, Make.defs, Kconfig, tflmrt_lenet_main.c with the new name chosen in step 1. E.g.
          tflmrt_lenet -> new_app
          TFLMRT_LENET -> NEW_APP

        3. Change the file name of tflmrt_lenet_main.c to <new name>_main.c.

        Best Regards,
        Kamil Tomaszewski

        • S
          suburban-daredevil @KamilTomaszewski last edited by

          Hey @kamiltomaszewski

          Thanks for taking the time to respond!

          I am in the directory that you have mentioned.

          I am attaching the screenshot of the path I am currently in

          7975e949-19cc-4a0a-9123-8753085b4b92-Screenshot from 2022-01-17 21-42-03.png

          But in the specified path, there is no such directory as tflmrt_lenet

          81b1d752-caa6-48fa-9d78-5b47c6ee82c4-Screenshot from 2022-01-17 21-43-02.png

          However, the tflmrt_lenet directory is present in another location, and it appears to contain all the necessary files.

          f24a5c12-ee24-486c-8b92-7bb527686b56-Screenshot from 2022-01-17 21-47-52.png

          dc86c29a-3beb-4d2a-ba0e-51a0dc782e2f-Screenshot from 2022-01-17 21-48-09.png

          8f722348-a4ae-4e86-b096-316ed1759221-Screenshot from 2022-01-17 21-50-04.png

          Are they both the same? Or is this the one you mentioned?

          Thanks

          • K
            KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

            Hi @suburban-daredevil

            I am sorry, my mistake. I meant: examples/tflmrt_lenet.

            Best Regards,
            Kamil Tomaszewski

            • S
              suburban-daredevil @KamilTomaszewski last edited by

              Hi @kamiltomaszewski

              I was going through the tflmrt_lenet example code and came across the following function:

              int tflm_runtime_forward(tflm_runtime_t *rt, const void *inputs[],
                                       unsigned char input_num);
              

              Here what does the variable "input_num" represent?

              Thanks

              • S
                suburban-daredevil @suburban-daredevil last edited by

                Hi @KamilTomaszewski

                In the code given below

                int tflm_runtime_output_shape(tflm_runtime_t *rt, unsigned char output_index, unsigned char dim_index)
                

                What do the variables "output_index" and "dim_index" represent?

                Thanks

                • K
                  KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

                  Hi @suburban-daredevil

                  input_num is the number of inputs you defined in the inputs array for your neural network. It equals the value returned by tflm_runtime_input_num().
                  output_index is the index that selects an output. You can check the number of outputs of your neural network using tflm_runtime_output_num().
                  dim_index is the index that selects a dimension. You can check the number of dimensions of a specific output using tflm_runtime_output_ndim().

                  Below is a short code that uses these functions to print information about the output of your neural network:

                    /* Number of outputs the network produces */
                    int output_num = tflm_runtime_output_num(&rt);
                    printf("output num: %d\n", output_num);
                    for (int i = 0; i < output_num; i++)
                    {
                      /* Size and number of dimensions of output i */
                      int ndim = tflm_runtime_output_ndim(&rt, i);
                      printf("output: %d, size: %d\n", i, tflm_runtime_output_size(&rt, i));
                      printf("output: %d, dim num: %d\n", i, ndim);
                      for (int j = 0; j < ndim; j++)
                      {
                        /* Extent of dimension j of output i */
                        printf("output: %d, dim: %d, shape: %d\n", i, j, tflm_runtime_output_shape(&rt, i, j));
                      }
                    }
                  

                  Best Regards,
                  Kamil Tomaszewski
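To make the roles of output_index and dim_index concrete, here is a minimal sketch that multiplies the per-dimension sizes of one output to get its element count. It does not use the real runtime: fake_runtime_t and the fake_* functions are hypothetical stand-ins that only mimic the index arguments described above, so the snippet can be compiled on a PC. On Spresense you would call tflm_runtime_output_ndim() and tflm_runtime_output_shape() on a tflm_runtime_t instead.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DIMS 4

/* Hypothetical stand-in for one output tensor: its rank and extents. */
typedef struct
{
  int ndim;
  int shape[MAX_DIMS];
} fake_output_t;

/* Hypothetical stand-in for the runtime handle (not tflm_runtime_t). */
typedef struct
{
  int output_num;
  const fake_output_t *outputs;
} fake_runtime_t;

/* Mimics "number of dimensions of the output selected by output_index". */
static int fake_output_ndim(const fake_runtime_t *rt,
                            unsigned char output_index)
{
  assert(rt != NULL && output_index < rt->output_num);
  return rt->outputs[output_index].ndim;
}

/* Mimics "extent of dimension dim_index of the output output_index". */
static int fake_output_shape(const fake_runtime_t *rt,
                             unsigned char output_index,
                             unsigned char dim_index)
{
  assert(dim_index < fake_output_ndim(rt, output_index));
  return rt->outputs[output_index].shape[dim_index];
}

/* Element count of one output = product of all its dimension extents. */
int output_element_count(const fake_runtime_t *rt,
                         unsigned char output_index)
{
  int count = 1;
  int ndim = fake_output_ndim(rt, output_index);

  for (int j = 0; j < ndim; j++)
    {
      count *= fake_output_shape(rt, output_index, (unsigned char)j);
    }

  return count;
}
```

For a classifier whose single output has shape 1 x 10, output_index 0 with dim_index 0 and 1 would yield 1 and 10, giving ten class scores.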

                  • S
                    suburban-daredevil @KamilTomaszewski last edited by suburban-daredevil

                    Hi @kamiltomaszewski

                    Is there any resource / documentation explaining the usage of all the function calls (especially the ones in the runtime.h file) used in the tflmrt_lenet program?

                    Thanks

                    • K
                      KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

                      Hi @suburban-daredevil
                      You can find the TFLMRT documentation here: https://developer.sony.com/develop/spresense/docs/sdk_developer_guide_en.html#_tflm_runtime
                      tflmrt_lenet example is described here:
                      https://developer.sony.com/develop/spresense/docs/sdk_tutorials_en.html#_tflmrt_sample_application
                      You can find the description of the functions in the comments in this file:
                      https://github.com/sonydevworld/spresense/blob/master/sdk/modules/include/tflmrt/runtime.h

                      Best Regards,
                      Kamil Tomaszewski

                      • S
                        suburban-daredevil last edited by

                        Hi @kamiltomaszewski

                        The tflmrt_lenet example was trained on 28x28 grayscale images. What changes have to be made, and in which files (app_main.c, pnm_util.c, etc.), to accept and process RGB images of a different size (say 96x96)?

                        Thanks

                        • S
                          suburban-daredevil last edited by

                          Hi @KamilTomaszewski

                          93dd823a-5624-44e2-a595-23bb171d0f6a-Screenshot from 2022-02-24 14-17-03.png

                          I just wanted to know, to embed our model's C-byte array code in our application, is it sufficient to replace the existing model0.c file's array contents with that of our new model?

                          I tried replacing the array contents in model0.c with those of my new model and also changed the model length. But every time I run inference, I get the same output regardless of the input. Am I missing something?

                          And also model_tflite is the buffer that holds the model.

                          acc5ea7b-e51e-40d7-9d16-0ccc8c3360c0-Screenshot from 2022-02-24 14-28-13.png

                          And network is the variable that holds our NN. It is assigned in the else part of the code block below

                          6652de38-50cc-4c49-ac3e-c9f87943677d-Screenshot from 2022-02-24 14-28-55.png

                          But I suspect that the network variable does not pick up the built-in model, and that this is why inference gives the wrong result.

                          Any help is appreciated

                          Thanks

                          • K
                            KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

                            Hi @suburban-daredevil,

                            You need to change #define MNIST_SIZE_PX (28 * 28) to #define MNIST_SIZE_PX (96 * 96 * 3) in the tflmrt_lenet_main.c file and #define MY_BUFSIZ (28 * 28) to #define MY_BUFSIZ (96 * 96 * 3) in the pnm_util.c file. I think that should be enough.

                            Does your model array have __attribute__((aligned))?

                            Best Regards,
                            Kamil Tomaszewski
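The two #define changes above come down to simple arithmetic: width x height x channels. The sketch below shows that calculation. IMG_WIDTH, IMG_HEIGHT, IMG_CHANNELS and both helper functions are hypothetical names chosen for illustration, not identifiers from tflmrt_lenet_main.c or pnm_util.c. The float buffer size is included on the assumption that the model takes float32 input; a quantized model would need a different element size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the example sources use MNIST_SIZE_PX and MY_BUFSIZ. */
#define IMG_WIDTH    96
#define IMG_HEIGHT   96
#define IMG_CHANNELS 3   /* 1 for grayscale, 3 for RGB */

/* Number of samples in one image: one per pixel per channel. */
size_t image_element_count(size_t width, size_t height, size_t channels)
{
  assert(width > 0 && height > 0 && channels > 0);
  return width * height * channels;
}

/* Bytes needed to hold the image as floats (assumes a float32 input). */
size_t image_float_bytes(size_t width, size_t height, size_t channels)
{
  return image_element_count(width, height, channels) * sizeof(float);
}
```

With these values, the original 28 x 28 x 1 image has 784 samples, while 96 x 96 x 3 has 27648, so any buffer sized for the MNIST case is far too small for the new input.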

                            • S
                              suburban-daredevil @KamilTomaszewski last edited by suburban-daredevil

                              Hi @kamiltomaszewski

                              I think you are referring to this, right?

                              4e49f22f-9895-427e-a0ee-d1801b404207-Screenshot from 2022-02-25 08-46-35.png

                              I'm not sure how to check whether my new model has __attribute__((aligned)).

                              Thanks
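One way to check is to look at the array declaration itself and, if in doubt, test the address at run time. The sketch below is a hedged illustration for GCC-compatible compilers: the 16-byte value, the model_tflite_len name, the helper functions and the eight placeholder bytes are assumptions made for this example, not the contents of the example's model0.c. The "TFL3" check relies on TensorFlow Lite flatbuffers carrying that file identifier at byte offset 4, which can help confirm that the array was pasted from a real .tflite file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ALIGNMENT 16  /* assumed value for this sketch */

/* Placeholder bytes only; a real array is generated from the .tflite file. */
const unsigned char model_tflite[]
  __attribute__((aligned(MODEL_ALIGNMENT))) =
{
  0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33
};

/* Hypothetical name for the model length variable. */
const unsigned int model_tflite_len = sizeof(model_tflite);

/* Returns 1 if ptr is a multiple of alignment, 0 otherwise. */
int is_aligned(const void *ptr, size_t alignment)
{
  assert(alignment != 0);
  return ((uintptr_t)ptr % alignment) == 0;
}

/* Returns 1 if the buffer carries the "TFL3" identifier at offset 4. */
int has_tflite_identifier(const unsigned char *buf, size_t len)
{
  assert(buf != NULL);
  return len >= 8 && memcmp(buf + 4, "TFL3", 4) == 0;
}
```

Calling is_aligned(model_tflite, MODEL_ALIGNMENT) and has_tflite_identifier(model_tflite, model_tflite_len) once at start-up and printing the results is a quick sanity check before running inference.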

                              • K
                                KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

                                Hi @suburban-daredevil

                                Yes, that is right.

                                Are you using the model as a C array or as a binary file that you load from the SD card?

                                • S
                                  suburban-daredevil @KamilTomaszewski last edited by

                                  Hi @kamiltomaszewski

                                  I have 2 queries

                                  • Currently I'm loading my model from the SD card, but I want to embed the model in my application folder itself. Is it enough to replace the contents of the model_tflite[] array with those of my new model, or are there other changes to be made?
                                  • When running the app from the NuttX prompt, I pass the path of the image to use for inference. Currently only 28x28 grayscale images are accepted. When I pass an image of any other dimensions, it says "pgm image load failed". I have made the change that you suggested above. Are there any other changes to be made?

                                  Thanks

                                  • K
                                    KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

                                    Hi @suburban-daredevil

                                    It should be enough to replace the contents of the model_tflite array with those of your new model.

                                    Could you check where exactly the pnm_load function returns an error?

                                    Best Regards,
                                    Kamil Tomaszewski
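To narrow down where a load fails, it can help to return a distinct error code for each check. The sketch below is not the pnm_load implementation from pnm_util.c; pnm_header_t and parse_pnm_header are hypothetical names, and the parser only illustrates the standard Netpbm header layout (magic number, width, height, maximum value). It also shows one thing worth checking: binary grayscale files start with P5, while binary RGB files start with P6, so a loader written only for PGM may reject an RGB image at the very first step. Header comment lines starting with # are not handled here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical header description used only by this sketch. */
typedef struct
{
  int channels;  /* 1 for P5 (grayscale), 3 for P6 (RGB) */
  int width;
  int height;
  int maxval;
} pnm_header_t;

/* Returns 0 on success, or a negative code naming the failed check:
 *   -1 header fields missing, -2 unsupported magic number,
 *   -3 invalid dimensions,    -4 unsupported maximum value.
 */
int parse_pnm_header(const char *text, pnm_header_t *hdr)
{
  char magic[3] = { 0 };

  assert(text != NULL && hdr != NULL);

  if (sscanf(text, "%2s %d %d %d", magic,
             &hdr->width, &hdr->height, &hdr->maxval) != 4)
    {
      return -1;
    }

  if (magic[0] == 'P' && magic[1] == '5')
    {
      hdr->channels = 1;
    }
  else if (magic[0] == 'P' && magic[1] == '6')
    {
      hdr->channels = 3;
    }
  else
    {
      return -2;
    }

  if (hdr->width <= 0 || hdr->height <= 0)
    {
      return -3;
    }

  if (hdr->maxval != 255)
    {
      return -4;
    }

  return 0;
}
```

Printing the returned code next to the "load failed" message would show whether the magic number, the dimensions, or the maximum value is being rejected.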

                                    • S
                                      suburban-daredevil last edited by suburban-daredevil

                                      Hi @KamilTomaszewski

                                      When I try to run "build and flash" from VS Code, I get the following error, even though a file called project_name.nuttx.spx is present in my out directory after building.

                                      4eff2a6f-f913-4ace-84d3-5f6684bb10d0-Screenshot from 2022-03-10 17-19-28.png

                                      Can you help me out with this?

                                      Thanks

                                      • K
                                        KamilTomaszewski DeveloperWorld @suburban-daredevil last edited by

                                        Hi @suburban-daredevil,

                                        I think there is a bug in the latest VS Code release. Could you please try an older release? For example: https://update.code.visualstudio.com/1.63.2/linux-deb-x64/stable

                                        Best Regards,
                                        Kamil Tomaszewski

                                        • S
                                          suburban-daredevil @KamilTomaszewski last edited by

                                          Hi @KamilTomaszewski

                                          It worked and solved the issue. Thanks for your help

                                          Thanks

                                          Developer World
                                          Copyright © 2021 Sony Group Corporation. All rights reserved.