Smart Assistant with edge AI computing - Summer Internship at Himax Imaging in Irvine (Ongoing)


Tags
Edge AI
Embedded System
Windows Desktop App Dev
Electron
React Native
Computer Vision
Emotion Detection
Task Automation
AIoT
Internet of Things (IoT)
Automatic Speech Processing
LLM
Date
Jun 17, 2024 → Sep 20, 2024
Created
Jul 2, 2024 02:06 AM
Description
This blog post documents the ongoing summer internship at Himax Imaging in Irvine, focusing on the development and improvement of a computer vision AI model for an AI module product. (Not Finished Yet)
Featured
Featured
Last Updated:

Introduction

This blog post keeps a record of my three-month summer internship at Himax Imaging Corp. in the summer of 2024, the first internship of my life. I was offered a software / machine learning engineer position, whose responsibilities are
  1. to develop and improve the computer vision AI model that lives inside a small chip (more formally, the AI module product ISM-028) designed by another team in our company, and
  2. to build a laptop application that demonstrates the capability and applicability of our chip.
Holding two master's degrees, in electrical engineering and computer science respectively, I found this internship a great fit for my background and interests. In particular, my previous projects, the Hand-Writing Robot and the AI-Powered Driver Companion Mobile App, together with my deep-learning-based natural language processing research and my experience in rapid software prototyping, equip me with many skills and much knowledge that I can tap into for this internship.
notion image

Background

This internship project has 3 main goals:
  1. to develop advanced AI-powered features for the AI chip ISM-028 that our company designed,
  2. to optimize the models for accuracy and speed, and
  3. to demonstrate the usability and potential of ISM-028.

What is an AI PC?

notion image

Endpoint AI vs Edge AI vs Cloud AI

to write

Always-On Sensing (AOS) and Always-On Vision (AOV)

to write

Goals

Model Development for More Downstream Computer Vision Tasks

Currently, we already have one vision model capable of tasks such as face detection and face orientation. We have also developed a couple of models for eye tracking, face landmarking/meshing, posture, and gesture detection. These features are very important in our product. However, we would also like to incorporate higher-level features that are closer to real laptop-user scenarios. One of them is Emotion Detection. Just as Jarvis knows when Tony Stark is struggling with a task, this feature lets other applications sense the user's emotion and act on it even before being asked. Imagine soothing background music starting to play when the system detects that you are feeling down while working against a stressful deadline. This is the potential that our AI module could bring to other applications.

Model Optimization

notion image
The other important goal of this project is to improve model accuracy and inference speed. The latter matters especially because the module has to be energy efficient and preserve the laptop's battery life. One way to achieve this is to compress our models so that the chip spends less time at peak utilization, thereby saving power. This makes it possible for our module to provide energy-efficient always-on operation, so the laptop can effortlessly listen to and see the user without quickly draining the battery.

Windows Desktop Application Development

In order to demonstrate the potential of our product to some of our big clients, the laptop manufacturers, we also need to show why and how our AI modules will benefit their laptops. To do so, we are developing a desktop application that runs on Windows and provides many useful features powered by our AI module.
To make this work, we add a Keyword Spotting (KWS) feature to our chip so that the smart assistant living inside your desktop can be summoned just by casting the spell “Hi WiseEye!”. Then you can directly ask this secretary to do anything for you, or it can, for example, offer tips for managing stress when it sees that you look stressed.

Framework and Tool Decision For Windows Desktop Application Development

There are a couple of frameworks you can use to develop Windows desktop applications. A common one is JavaScript with Electron for modern application development. We chose this route because we also need a GUI for the app, and the front-end framework (ReactJS), the rich back-end libraries, and the active JavaScript community give us useful third-party packages that we can download and use directly. This speeds up development and is a perfect fit for our demo application.
Also, if you run into any issues, you can almost always find solutions online thanks to the large and active community. The only consideration is that some OS-level features may not be available in the npm registry. Those would require C++ or C# with the Windows API / Windows App SDK, or even Win32, WinRT, or UWP APIs in the rare cases where you need very low-level features such as controlling the keyboard backlight.
💡
There is also an alternative way to develop a native application with JavaScript: React Native. It supports not only native mobile apps for iOS and Android, but also native desktop apps; for Windows, it is called React Native for Windows. We did try it, but it still seems buggy, since React Native has only supported Windows since 2023, which is only about a year at the moment. Therefore, we switched to ElectronJS.

System Diagram

notion image

UI Design

โ€ฆ
notion image

Environment Setup for React-Native

Using CLI (Failed)

$ Set-ExecutionPolicy Unrestricted -Scope Process -Force
$ iex (New-Object System.Net.WebClient).DownloadString('https://aka.ms/rnw-vs2022-deps.ps1');
$ npx --yes react-native@latest init wise --version "latest"
$ cd wise
wise\ $ npx --yes react-native-windows-init --overwrite
Launching the application with npx react-native run-windows failed.
The build succeeded, but the error message below occurred during deployment:
× Failed to deploy: ERROR: ReflectionTypeLoadException: Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information. Command failed. Re-run the command with --logging for more information.
× Deploying C:/Users/550030/wise/windows/x64/Debug/wise/wise.build.appxrecipe: ERROR: ERROR: ReflectionTypeLoadE...
ERROR: Exceptions from the reflection loader:
ERROR: FileLoadException: Could not load file or assembly 'NuGet.VisualStudio.Contracts, Version=17.10.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)
  • Failed to deploy. Package NuGet.VisualStudio.Contracts was missing.
    • The package NuGet.VisualStudio.Contracts contains RPC contracts for NuGet's Visual Studio Service Broker extensibility APIs. These APIs are designed to be usable with async code and are available in this package using Visual Studio's IServiceBroker.
    • dotnet list windows\wise.sln package also does not work 😟
      • The error shows that the SolutionDir environment variable is empty and node_modules\react-native-windows\package.json could not be found. ⇒ Why is it empty? Set the variable as a workaround?

With Visual Studio Code (Failed 😟)

  • install extension
  • create a file in .vscode\launch.json with the configuration for Debug
  • run Debug

With Visual Studio (It works!)

  • npx react-native autolink-windows
  • build, deploy and run on local machine or just debug

Environment Setup for Electron with React

Frontend framework: React
Package Manager: Yarn
React framework: None, just use bundler Vite
Tooling: Vite
TypeScript transpiler: esbuild by vite instead of tsc
CSS: tailwind
Unit Testing: vitest (jest or react-testing-library)
Integration Testing: Cypress
Plugins:

Setup for React and Electron with TypeScript

$ yarn add -D vite
$ yarn create vite
√ Select a framework: » Others
√ Select a variant: » create-electron-vite ↗
√ Project template: » React
# this will come with TypeScript, React, and Electron, but it has some issues installing electron
# using this script.
[1/2] ⢀ esbuild
error C:\Users\550030\wise-electron\node_modules\electron: Command failed.
Exit code: 1
Command: node install.js
Arguments:
Directory: C:\Users\550030\wise-electron\node_modules\electron
Output:
RequestError: unable to verify the first certificate
# to solve this, install electron manually by `yarn add --dev electron`
$ yarn install
$ yarn dev
The application is launched with yarn dev, and a desktop window shows up as follows.
notion image

Setup for Tailwind CSS

Check out my other blog post, Exploring PostCSS and Tailwind CSS: A Modern Approach to CSS.
Tailwind CSS uses postcss and autoprefixer packages, so we have to install them first too.
$ yarn add --dev tailwindcss postcss autoprefixer
# tailwindcss init flags:
#   --esm           Initialize configuration file as ESM
#   --ts            Initialize configuration file as TypeScript
#   -p, --postcss   Initialize a `postcss.config.js` file
#   -f, --full      Include the default values for all options in the generated configuration file
$ npx tailwindcss init -p
Created Tailwind CSS config file: tailwind.config.js
Created PostCSS config file: postcss.config.js
Add paths to the content field of tailwind.config.js
content: [
  "./index.html",
  "./src/**/*.{js,ts,jsx,tsx}",
],
Add the Tailwind directives to your CSS: add the @tailwind directives for each of Tailwind's layers to your ./src/index.css file.
@tailwind base;
@tailwind components;
@tailwind utilities;
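To confirm the setup works end to end, any React component can now use utility classes directly. A minimal smoke-test sketch (the component name and classes below are only illustrative, not part of the actual demo UI):
// src/components/Hello.tsx: hypothetical smoke test that Tailwind classes are applied
export function Hello() {
  return (
    <div className="flex h-screen items-center justify-center bg-slate-900">
      <h1 className="text-3xl font-bold text-emerald-400">Hi WiseEye!</h1>
    </div>
  );
}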

Use react-chatbot-kit

First, create a subfolder, Chatbot, under src/components/ to keep all chatbot-related files in a single place. This folder contains:
Chatbot/
  config.js
  MessageParser.jsx
  ActionProvider.jsx
  Chatbot.tsx
  Chatbot.css
The first three files are suggested by the library.
For config.js,
import { createChatBotMessage } from 'react-chatbot-kit';

const config = {
  initialMessages: [createChatBotMessage(`Hello world`)],
};

export default config;
For ActionProvider.jsx, we copy-paste the following content:
import React from 'react';

const ActionProvider = ({ createChatBotMessage, setState, children }) => {
  return (
    <div>
      {React.Children.map(children, (child) => {
        return React.cloneElement(child, {
          actions: {},
        });
      })}
    </div>
  );
};

export default ActionProvider;
For MessageParser.jsx,
import React from 'react';

const MessageParser = ({ children, actions }) => {
  const parse = (message) => {
    console.log(message);
  };

  return (
    <div>
      {React.Children.map(children, (child) => {
        return React.cloneElement(child, {
          parse: parse,
          actions: {},
        });
      })}
    </div>
  );
};

export default MessageParser;
Chatbot.tsx is a file I created specifically so that other files can import the MyChatbot React component and use it elsewhere. Inside this file, as suggested by the react-chatbot-kit documentation, copy-paste the following:
import Chatbot from 'react-chatbot-kit';
import 'react-chatbot-kit/build/main.css';

import config from './config.js';
import MessageParser from './MessageParser.jsx';
import ActionProvider from './ActionProvider.jsx';

export const MyChatbot = () => {
  ...
  return (
    <div>
      <Chatbot
        config={config}
        messageParser={MessageParser}
        actionProvider={ActionProvider}
      />
    </div>
  );
  ...
};
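To actually render the chatbot, some component has to import MyChatbot. A minimal sketch, assuming the app entry component scaffolded by create-electron-vite is src/App.tsx (the file name and surrounding markup are assumptions, not the exact demo code):
// src/App.tsx: hypothetical wiring of the chatbot into the app root
import { MyChatbot } from './components/Chatbot/Chatbot';

function App() {
  return (
    <div className="app">
      <MyChatbot />
    </div>
  );
}

export default App;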
Then we can fire up the application with yarn dev, which runs Electron. The rendered page in Electron looks like this:
notion image

Chatbot Integration

I reused the chatbot I had implemented with the LangChain.js framework in TypeScript, so here I only needed to integrate the chatbot pipeline with the React UI components. After about one to two hours of work resolving some async issues, mostly in MessageParser.jsx and ActionProvider.jsx, the integration was done; the core pattern is sketched below. I also got to fix an existing bug in the original chatbot pipeline, where it suffered from an amnesia problem, by removing the redundant question-rewriting model.
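The essence of that integration is awaiting the LangChain.js pipeline inside the action provider and appending the answer to the chatbot state. A hedged sketch of the pattern only (askChatbot stands in for my actual chain call, and the handler name is arbitrary):
// Sketch of the async wiring inside the action provider; askChatbot is a
// placeholder for the LangChain.js chain from my earlier chatbot project.
import React from 'react';

async function askChatbot(message: string): Promise<string> {
  // Placeholder: replace with the real chain invocation.
  return `You said: ${message}`;
}

const ActionProvider = ({ createChatBotMessage, setState, children }: any) => {
  const handleUserMessage = async (message: string) => {
    const answer = await askChatbot(message);
    const botMessage = createChatBotMessage(answer);
    // Append the bot's reply once the async call resolves.
    setState((prev: any) => ({ ...prev, messages: [...prev.messages, botMessage] }));
  };

  return (
    <div>
      {React.Children.map(children, (child: any) =>
        React.cloneElement(child, { actions: { handleUserMessage } })
      )}
    </div>
  );
};

export default ActionProvider;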

The First Natural Command - Set Brightness

Next, because this was the first time I used ElectronJS to develop a Windows desktop app, I was still not sure whether or how it could perform system control. So I started with an OS-level feature, setting the screen brightness, to make sure the framework can do that.
The first thing to do is to implement a message-parsing function that recognizes when a message is a command the assistant has to forward to the OS; for now I simply use a regex, as sketched below. The second thing is to actually execute the command, which is where things got tricky.
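A minimal sketch of that regex-based first pass (the command set and patterns are illustrative; the real grammar in the demo app is richer):
// commandParser.ts: hypothetical shape of the first-pass regex parser
type ParsedCommand =
  | { kind: 'setBrightness'; level: number }
  | { kind: 'chat'; text: string };

export function parseMessage(message: string): ParsedCommand {
  // Matches e.g. "set brightness to 70" or "brightness 40%"
  const match = message.match(/brightness\s+(?:to\s+)?(\d{1,3})\s*%?/i);
  if (match) {
    const level = Math.min(100, parseInt(match[1], 10));
    return { kind: 'setBrightness', level };
  }
  // Everything else is handed to the chatbot as normal conversation.
  return { kind: 'chat', text: message };
}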
After 3 to 4 hours of work and debugging, it fortunately works. The key concept here is that Electron has
  • a renderer process that runs inside a browser and cannot access the OS directly (like a virtual environment or a sandboxed world), and
  • a main process that runs on the operating system, much like a back-end: it has direct access to the OS and controls both the browser window the renderer process runs in and the application lifecycle.
To enable our React app to communicate with the OS, we use the Inter-Process Communication (IPC) feature provided by ElectronJS, wiring preload.ts/js together with main.ts/js and renderer.ts/js, as in the sketch below.
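A minimal sketch of that wiring, assuming contextIsolation is enabled (the channel name, the wiseApi global, and the PowerShell/WMI brightness call are my illustrative choices, not necessarily the exact code in the demo):
// preload.ts: expose a narrow API to the renderer through the context bridge
import { contextBridge, ipcRenderer } from 'electron';

contextBridge.exposeInMainWorld('wiseApi', {
  setBrightness: (level: number) => ipcRenderer.invoke('set-brightness', level),
});

// main.ts: the main process handles the request and talks to the OS
import { ipcMain } from 'electron';
import { exec } from 'node:child_process';

ipcMain.handle('set-brightness', async (_event, level: number) => {
  const value = Math.max(0, Math.min(100, Math.round(level))); // sanitize before shelling out
  // One way to set brightness on many Windows laptops: the WMI brightness method via PowerShell.
  const ps = `(Get-WmiObject -Namespace root/WMI -Class WmiMonitorBrightnessMethods).WmiSetBrightness(1, ${value})`;
  await new Promise<void>((resolve, reject) =>
    exec(`powershell -Command "${ps}"`, (err) => (err ? reject(err) : resolve()))
  );
});

// In the renderer (React), the command handler only sees the bridged API:
//   await window.wiseApi.setBrightness(70);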
The result is shown below.
notion image

C++ Event Listener

  • FT4222 Library
  • Check EPII_ISP_TOOL

Django Backend

  • django-rest-framework (DRF)
  • django-filters
  • drf-spectacular: for API documentation with Swagger/Redoc and client generation with OpenAPI

Client Generation by OpenAPI

(wise-backend) wise_backend/ $ python manage.py spectacular > schema.yml
wise-electron/ $ openapi-generator-cli generate -i ..\wise_backend\schema.yml -g typescript-fetch -o .\src\services
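With the client generated into src/services, the Electron renderer can call the Django API in a type-safe way. A hedged usage sketch: the typescript-fetch generator emits a Configuration class plus API classes whose names and methods come from the schema's tags and operationIds, so DefaultApi and listEvents below are placeholders for whatever your schema actually produces:
// Hypothetical usage of the generated typescript-fetch client
import { Configuration, DefaultApi } from './services';

const api = new DefaultApi(
  new Configuration({ basePath: 'http://127.0.0.1:8000' }) // local Django dev server
);

async function fetchEvents() {
  const events = await api.listEvents(); // placeholder operation name
  console.log(events);
}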

To Investigate

  1. C/C++ addons for Node.js
    1. rollup
    2. node api
    3. node.h
    4. NAN
  2. C# / .NET as Backend
  3. .NET Core with TypeScript