Open Source AI Finder

Discover the latest open-source models for your projects.

MVP4D

Computer Vision

A high-resolution dataset and a method for estimating 4D scene flow (3D motion of points over time) from multi-view camera inputs.

4D scene flow estimation, dynamic scene reconstruction, autonomous driving perception, robotics, 3D motion analysis

D2E (Dialog-to-Event)

Natural Language Processing

A dataset and baseline model for open-domain event extraction from conversations, aiming to identify and structure event information discussed in dialogues.

event extraction, conversational AI, information extraction from dialogue, data structuring

RTFM (Worldlabs AI)

Natural Language Processing

A technique that allows a language model to answer questions about a large document by ingesting its entire content into the context, using a leave-one-out attention mechanism, without needing to fine-tune the model.

document question answering, information retrieval, contextual QA, RAG alternative, legal document analysis

TAG

text-to-3d

A model that generates realistic and controllable human actions within 3D scenes based on natural language descriptions.

3D animation, robotics simulation, virtual human generation, game development, film pre-visualization

PhysHSI

Image Generation

A physically-based rendering framework for synthesizing realistic hyperspectral images (HSI), which can be used as training data for other deep learning models.

synthetic data generation, hyperspectral imaging, scientific research, remote sensing, agricultural monitoring

Ring-1T

text-generation

A 1-trillion-parameter Chinese-English bilingual large language model, demonstrating strong capabilities in both languages.

bilingual text generation, Chinese language processing, English language processing, translation, content creation

UP2You

text-to-image

A training-free method to personalize text-to-image models, enabling the generation of images featuring specific subjects (like a person or pet) from just a few personal photos.

personalized image generation, creating images with specific faces, customizing diffusion models, training-free personalization

DeepSomatic

Scientific Understanding

An AI model based on DeepVariant technology that accurately identifies genetic variants in tumors (somatic variants) from DNA sequencing data.

cancer research, genomic analysis, tumor variant detection, somatic mutation calling, precision medicine

DreamOmni2

image-to-3d

A unified diffusion model that generates high-fidelity 3D objects and consistent multi-view images from either a single image or a text prompt.

3D model generation, object creation from images, text-to-3D synthesis, multi-view image generation

StreamingVLM

multimodal

An efficient framework for Large Multimodal Models (LMMs) to process and understand long videos in a streaming fashion, maintaining high accuracy without needing to access the entire video at once.

long video understanding, video summarization, video question answering, real-time video analysis