Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

Supplementary Material

 


We recommend watching all videos in full screen. Click a video to view it at full scale.

 


Movie Weaver supports different reference configurations (Figure 1)

Reference combinations: Image 1, Image 2, Image 3, Image 4, Image 5, Image 6

Prompt: The video shows a young man [R1] holding a dog in his arms. The man is standing on a sidewalk in front of a row of buildings. The man is holding the dog in his arms and smiling at the camera. He then looks down at the dog and smiles. He then looks back at the camera and smiles. The camera is static.


Reference combinations: Image 1, Image 2, Image 3, Image 4

Prompt: A man [R1] and a woman [R2] are working in a data center with rows of server racks and super computers. They are discussing their work as they check cables and other equipment. The video shows two people in a server room. The woman holds a black tablet in her hand on the left. The man holds a black notebook in his hand on the left. The background is a dark server room with rows of servers and a white ceiling. The woman walks to the left, looking at the man. She smiles and talks to him. The man looks at her and smiles. He talks to her and gestures with his hand on the right. The camera is handheld.

 


Qualitative results of Movie Weaver (Figure 5)

Reference combinations: Image 1, Image 2, Image 3, Image 4

Prompt: A man [R1] and a man [R2] Hard Hat Walking, Talking, and Using Tablet Computer. Glass Building or Skyscraper under Construction on Background. Shot on RED Cinema Camera in 4K (UHD). The video shows two men in hard hats and safety vests standing outside a building, looking at a blueprint. They are both looking down at a blueprint in the man's hands. The man on the left is holding a black device in his hand. The background is a large glass building with a metal grid ceiling. The man on the left is looking down at the blueprint, while the man on the right is looking up at the ceiling. The man on the left is pointing at the blueprint with his finger. The camera is handheld.


Reference combinations: Image 1, Image 2, Image 3, Image 4

Prompt: The video shows a man [R1] [R2] and a dog [R3] sitting at a table with a Christmas tree in the background. The man is holding a fork in his hand on the left and a glass of wine in his hand on the right. The dog is sitting on the right side of the table. The man is looking at the dog and talking to it. The dog is looking at the man and licking its lips. The man then puts the fork down and reaches out to pet the dog. The dog leans in to the man's hand and licks his hand. The camera is static.


Reference combinations: Image 1, Image 2, Image 3, Image 4

Prompt: a woman [R1] [R2] and a man [R3] [R4] eating salad after fitness workout on beach. Multiracial a woman and a man having a break on beach snacking on a vegan takeaway meal of green veggies laughing together. The video shows a woman and a man sitting on the beach, eating salads. They are both sitting on the sand, with their legs crossed and their feet pointed towards the camera. They are both looking at each other and smiling. The woman is holding a salad in her hand on the left, and the man is holding a salad in his hand on the right. The background is a beach with palm trees and a body of water. The sky is overcast. The camera is static.

 


Comparison with state-of-the-art Vidu 1.5 (Figure 6)

Reference images of Movie Weaver (ours); reference images of Vidu 1.5
Reference 1 Reference 2
Prompt (Movie Weaver, ours): The video shows a woman [R1] [R2] sitting on a bench in a park, petting a dog [R3]. The woman is sitting on a bench, and she is holding the dog in her lap. The woman is smiling and gently petting the dog's head and back. The background is a grassy field with trees and bushes, and a mountain range in the distance. The sky is blue and clear. The camera is static.
Prompt (Vidu 1.5): The video shows a woman sitting on a bench in a park, petting a dog. The woman is sitting on a bench, and she is holding the dog in her lap. The woman is smiling and gently petting the dog's head and back. The background is a grassy field with trees and bushes, and a mountain range in the distance. The sky is blue and clear. The camera is static.

Reference images of Movie Weaver (ours); reference images of Vidu 1.5
Reference 1 Reference 2
Prompt (Movie Weaver, ours): A man [R1] [R2] and a man [R3] [R4] working and taking notes together in table of a little office. Freelancer meeting, man with laptop drink coffee in coworking. The video shows two men sitting on a black leather couch, looking at a book. The men are sitting on a black leather couch, with a small table in front of them. The man on the left is holding a book, and the man on the right is holding a laptop. The background is a window with a view of the outside. As the video progresses, the man on the left flips the pages of the book, and the man on the right looks at the laptop. The camera is static.
Prompt (Vidu 1.5): A man and a man working and taking notes together in table of a little office. Freelancer meeting, man with laptop drink coffee in coworking. The video shows two men sitting on a black leather couch, looking at a book. The men are sitting on a black leather couch, with a small table in front of them. The man on the left is holding a book, and the man on the right is holding a laptop. The background is a window with a view of the outside. As the video progresses, the man on the left flips the pages of the book, and the man on the right looks at the laptop. The camera is static.
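
In both comparisons, the Vidu 1.5 prompt reads as the Movie Weaver prompt with the [R#] anchors removed. The relationship can be illustrated with a minimal sketch, assuming anchors always take the form [R<number>] as in the prompts on this page. The names ANCHOR, anchor_ids, and strip_anchors are hypothetical helpers written for this illustration and are not part of Movie Weaver's code.

```python
import re

# Assumption: every anchor looks like "[R1]", "[R2]", ... as in the prompts shown here.
# The pattern also consumes whitespace before the anchor so no double spaces remain.
ANCHOR = re.compile(r"\s*\[R(\d+)\]")


def anchor_ids(prompt: str) -> list[int]:
    """Return the anchor indices in order of appearance."""
    return [int(m) for m in ANCHOR.findall(prompt)]


def strip_anchors(prompt: str) -> str:
    """Remove every anchor together with the whitespace that precedes it."""
    return ANCHOR.sub("", prompt)


# First sentence of the first comparison prompt above.
anchored = (
    "The video shows a woman [R1] [R2] sitting on a bench in a park, "
    "petting a dog [R3]."
)
plain = strip_anchors(anchored)
ids = anchor_ids(anchored)
```

Here `plain` matches the first sentence of the Vidu 1.5 prompt in the first comparison, and `ids` lists the reference slots the anchored prompt expects.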

 


Ablation Study (Figure 7)

w/o Anchored Prompts (AP), w/o Concept Embeddings (CE)
w/o Anchored Prompts (AP), w/ Concept Embeddings (CE)
w/ Anchored Prompts (AP), w/ Concept Embeddings (CE)
Reference 1 Reference 2 Reference 3

 


Limitations (Figure 8)

Limitation 1: Reference images can dominate the generation, leading to reduced motion and poor prompt alignment.
Limitation 2: Movie Weaver struggles to generalize to configurations not seen during training.
Reference 1 Reference 2
Prompt (Limitation 1): A man [R1][R2] and another man [R3][R4] are playing basketball.
Prompt (Limitation 2): A man [R1] in white T-shirt, a man [R2] in black leather jacket and a man [R3] in yellow shirt are talking.

 


Reference order change (Figure 1 of Appendix)

Prompt: A man wearing a black leather jacket and sunglasses [R1] is sitting on a motorcycle next to a man in a yellow T-shirt [R2]. The video shows the man in the black leather jacket revving the motorcycle's engine, creating a deep, throaty sound. The man in the yellow T-shirt stands beside him, checking his phone and occasionally glancing at the motorcycle. They are parked in an urban alleyway, with graffiti-covered walls and the distant hum of city life in the background. The man on the motorcycle adjusts his sunglasses and nods towards the street, suggesting they are about to leave. The man in the yellow T-shirt pockets his phone and gives a thumbs-up. The camera captures the scene with a slight zoom, focusing on the anticipation of their upcoming ride.

Reference 1 Reference 2
Reference 1 Reference 2