Open Source AI Finder

Discover the latest open-source models for your projects.

MVP4D

Computer Vision

A high-resolution dataset and a method for estimating 4D scene flow (3D motion of points over time) from multi-view camera inputs.

4D scene flow estimation, dynamic scene reconstruction, autonomous driving perception, robotics, 3D motion analysis

📄 Open Source (MIT License)

D2E (Dialog-to-Event)

Natural Language Processing

A dataset and baseline model for open-domain event extraction from conversations, aiming to identify and structure event information discussed in dialogues.

event extraction, conversational AI, information extraction from dialogue, data structuring

📄 Open Source (CC BY-NC 4.0)

A technique that lets a language model answer questions about a large document by loading the entire document into its context and applying a leave-one-out attention mechanism, with no fine-tuning of the model required.

document question answering, information retrieval, contextual QA, RAG alternative, legal document analysis

📄 Not specified, no code provided

TAG

text-to-3d

A model that generates realistic and controllable human actions within 3D scenes based on natural language descriptions.

3D animation, robotics simulation, virtual human generation, game development, film pre-visualization

📄 Code to be released

PhysHSI

Image Generation

A physically-based rendering framework for synthesizing realistic hyperspectral images (HSI), which can be used as training data for other deep learning models.

synthetic data generation, hyperspectral imaging, scientific research, remote sensing, agricultural monitoring

📄 Not specified

Ring-1T

text-generation

A 1-trillion-parameter Chinese-English bilingual large language model with strong capabilities in both languages.

bilingual text generation, Chinese language processing, English language processing, translation, content creation

📄 Code is Open Source (Apache 2.0), but the model weights are under a custom, more restrictive license agreement.

UP2You

text-to-image

A training-free method to personalize text-to-image models, enabling the generation of images featuring specific subjects (like a person or pet) from just a few personal photos.

personalized image generation, creating images with specific faces, customizing diffusion models, training-free personalization

📄 Code to be released

DeepSomatic

Scientific Understanding

An AI model based on DeepVariant technology that accurately identifies genetic variants in tumors (somatic variants) from DNA sequencing data.

cancer research, genomic analysis, tumor variant detection, somatic mutation calling, precision medicine

📄 Open Source (BSD 3-Clause)

DreamOmni2

image-to-3d

A unified diffusion model that generates high-fidelity 3D objects and consistent multi-view images from either a single image or a text prompt.

3D model generation, object creation from images, text-to-3D synthesis, multi-view image generation

📄 Code to be released

StreamingVLM

multimodal

An efficient framework that lets large multimodal models (LMMs) process and understand long videos in a streaming fashion, maintaining high accuracy without access to the entire video at once.

long video understanding, video summarization, video question answering, real-time video analysis

📄 Open Source (MIT License)
