Portfolio

A collection of my work, projects, and publications

12 projects across 6 years

2025 (4 projects)

Work 2025

Knovy is a cross-platform, always-on AI assistant that runs as a desktop toolbar to keep users focused while multitasking across meetings, content, and applications. It provides low-latency, privacy-preserving real-time transcription and contextual AI actions by processing audio locally and continuously understanding on-screen context.

Website • Demo

Project 2025

Secreterry transforms the way you learn and manage knowledge online. Built with TypeScript and React, this Chrome extension streamlines personal knowledge management by automatically capturing and organizing web content into Notion, cutting manual effort by up to 8x.

Powered by Google Gemini, it doesn’t just save time—it helps you make sense of what you read. Secreterry summarizes content, classifies topics, and extracts meaningful keywords, turning scattered browsing sessions into a structured, evolving knowledge base. By tracking your reading habits and revealing your core interests, it empowers you to understand your digital curiosity, learn intentionally, and build lasting insight from everyday information.

Project 2025

Kobo2Notion is an Electron application built with TypeScript and React that connects your Kobo e‑reader directly to Notion, automatically extracting bookmarks and highlights and uploading them to your databases. With optional AI‑powered summarization using Google Gemini, it helps readers turn key passages into concise insights without breaking their reading flow.

Designed as a cost‑free, privacy‑conscious alternative to paid services like Readwise, Kobo2Notion gives users full control over their reading data. All highlights are processed locally by default, ensuring data security and ownership. When AI summarization is enabled, only the selected text is sent to Gemini’s cloud API to generate summaries—striking a thoughtful balance between convenience and privacy.

GitHub

Project 2025

Invoice Agent is an intelligent automation system that eliminates the tedious work of manual invoice data entry. Inspired by a real-world need—my brother’s local dessert shop that often deals with handwritten invoices—this project turns a repetitive 5‑minute task into a streamlined 30‑second process, achieving a 10x improvement in efficiency.

Powered by Google Gemini, the system reads invoices from PDFs or images, then uses fuzzy matching algorithms to accurately map items to a product database. A clean, intuitive web interface enables users to quickly review and correct results. Over time, Invoice Agent learns from user feedback, continuously improving its understanding and accuracy.
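The fuzzy matching step can be illustrated with a minimal sketch. It assumes a simple similarity ratio from Python's standard `difflib` module; the product names, the `match_item` helper, and the 0.6 threshold are hypothetical and are not taken from the actual Invoice Agent code.

```python
from difflib import SequenceMatcher

# Hypothetical product catalog; the real database schema is not described here.
PRODUCTS = ["Chocolate Cake", "Strawberry Tart", "Vanilla Pudding", "Matcha Roll"]


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_item(raw_name: str, products=PRODUCTS, threshold: float = 0.6):
    """Map a possibly misspelled invoice item to the closest catalog product.

    Returns (product, score), or (None, score) when the best score is below
    the threshold, so the item can be flagged for manual review.
    """
    best = max(products, key=lambda p: similarity(raw_name, p))
    score = similarity(raw_name, best)
    return (best, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    for name in ["choclate cake", "Strawbery tart", "xyz"]:
        print(name, "->", match_item(name))
```

Returning `None` below the threshold keeps uncertain matches visible in the review interface instead of silently mapping them to the wrong product.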

GitHub

2024 (1 project)

Publication 2024

VM-ASR is a lightweight, dual-stream U-Net model designed for efficient audio super-resolution (ASR), or bandwidth extension. It enhances low-resolution audio (e.g., 8 kHz) by reconstructing missing high-frequency components to produce high-fidelity sound (e.g., 48 kHz). The model integrates Visual State Space (VSS) blocks derived from VMamba to capture both global and local acoustic contexts, while its dual-stream U‑Net architecture separately processes magnitude and phase spectra to improve harmonic accuracy and phase reconstruction.
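The idea of handling magnitude and phase as two separate streams can be sketched in a few lines. This is only an illustration of the representation, using Python's standard `cmath` module on a short list of complex spectrum bins; the helper names are hypothetical, and the learned per-stream processing that VM-ASR applies is not reproduced here.

```python
import cmath


def split_streams(spectrum):
    """Split complex spectrum bins into a magnitude stream and a phase stream."""
    magnitude = [abs(z) for z in spectrum]
    phase = [cmath.phase(z) for z in spectrum]  # radians in [-pi, pi]
    return magnitude, phase


def merge_streams(magnitude, phase):
    """Recombine the two streams into complex bins (polar to rectangular)."""
    return [cmath.rect(m, p) for m, p in zip(magnitude, phase)]


if __name__ == "__main__":
    bins = [3 + 4j, -1j, 2 + 0j]  # toy values, not real audio data
    mag, pha = split_streams(bins)
    # In a dual-stream model, each stream would be processed by its own
    # branch at this point before the two are merged again.
    print(mag, pha, merge_streams(mag, pha))
```

Splitting and then merging without any processing should return the original bins up to floating-point error, which makes the round trip a convenient sanity check.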

Extensive evaluation on the VCTK dataset shows that VM‑ASR surpasses state‑of‑the‑art approaches in spectral reconstruction quality across multiple upscaling configurations. Despite its strong performance, it maintains an exceptionally compact design—just 3.01 M parameters and 2.98 GFLOPS—achieving ~27× real‑time processing speed for 16 kHz → 48 kHz upsampling on a Tesla V100 GPU. This combination of accuracy, speed, and efficiency highlights VM‑ASR’s potential for real‑time deployments in telecommunications, speech synthesis, and audio restoration, enabling high‑quality sound enhancement even in resource‑constrained environments.

GitHub • Paper • Demo

2023 (1 project)

Project 2023

Easy Zoom In: Super-Resolution of Cityscapes is a system developed as a final project for the "Computer Vision Practice with Deep Learning (CVPDL)" course at National Taiwan University (Spring 2023). It aims to enhance the visual quality of low-resolution dashcam or monitor video feeds, particularly focusing on improving the details of objects like cars, pedestrians, and riders. This is achieved by combining YOLOv7 for object detection with Latent Diffusion Models (LDM) for super-resolution of the detected and cropped regions, offering a cost-effective alternative to expensive hardware upgrades.
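The detect, crop, and upscale flow described above can be sketched as follows. This is a toy illustration on nested lists rather than real video frames: the `detect` callback stands in for YOLOv7, and `nearest_upscale` is a plain nearest-neighbor placeholder for the latent diffusion step. All names and the box format are assumptions, not the project's actual code.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def nearest_upscale(patch: Image, factor: int) -> Image:
    """Placeholder upscaler: repeat every pixel factor times in both axes."""
    out: Image = []
    for row in patch:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def enhance_objects(
    image: Image,
    detect: Callable[[Image], List[Box]],
    upscale: Callable[[Image, int], Image] = nearest_upscale,
    factor: int = 4,
) -> List[Tuple[Box, Image]]:
    """Upscale only the detected regions instead of the whole frame."""
    return [(box, upscale(crop(image, box), factor)) for box in detect(image)]


if __name__ == "__main__":
    frame = [[r * 4 + c for c in range(4)] for r in range(4)]
    fake_detector = lambda img: [(1, 1, 3, 3)]  # hypothetical detection result
    print(enhance_objects(frame, fake_detector, factor=2))
```

Restricting the expensive upscaler to detected crops is what keeps the cost down: most of each frame never passes through the super-resolution model.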

GitHub • Demo

Portfolio

Knovy @ INTEVIA AI

Secreterry – AI-Powered Web Reading Companion for Notion

Kobo2Notion - Easily sync your Kobo highlights to Notion

Invoice Agent - AI-Powered Invoice Automation with n8n

VM-ASR: A Lightweight Dual-Stream U-Net Model for Efficient Audio Super-Resolution

Easy Zoom In: Super-Resolution of Cityscapes
