Toole

Blog

FEATURED

Sakana AI drops image models to generate Japan’s traditional ukiyo-e artwork

Remember Sakana AI? Almost a year ago, the Tokyo-based startup made a striking appearance on the AI scene with its high-profile founders from Google and a novel automated merging-based approach to developing high-performing models. Today, the company announced two new image-generation models: Evo-Ukiyoe and Evo-Nishikie.

Available on Hugging Face, the models have been designed to generate images from text and image prompts. However, there’s an interesting and unique catch: instead of handling regular image generation in different styles, these models are laser-focused on Japan’s popular historic art form ukiyo-e. It flourished between the 17th and 19th centuries, and Sakana hopes to bring it back to modern content consumers using the power of AI.

The move comes as the latest localization effort in the AI space — something that has grown over the past year, with companies in countries like South Korea, India and China building models tailored to their respective cultures and dialects.

What to expect from the new Sakana AI models?

Dating back to the early 1600s, ukiyo-e – or “pictures of the floating world” – evolved as a popular art form in Japan, focusing on subjects such as historical scenes, landscapes and sumo wrestlers. The genre began with monochrome woodblock prints but eventually graduated to full-color prints, or “nishiki-e,” made with multiple woodblocks. Its popularity declined in the 19th century due to multiple factors, including the rise of photography.

Now, with the release of the two image-generation models, Sakana wants to bring the historic artwork back into popular culture. The first one – Evo-Ukiyoe – is a text-to-image offering that generates images closely resembling ukiyo-e, especially when prompted with text inputs describing elements commonly found in ukiyo-e art such as cherry blossoms, kimono or birds. It can even generate ukiyo-e-style art with things that did not exist back then, like a hamburger or laptop, but the company points out that sometimes the results may veer off track — not resembling ukiyo-e at all.

The model is based on Evo-SDXL-JP, which Sakana developed using its novel evolutionary model merging technique on top of Stability AI’s SDXL and other open diffusion models. The company said it used LoRA (Low-Rank Adaptation) to fine-tune Evo-SDXL-JP on a dataset of over 24,000 carefully-captioned ukiyo-e artworks acquired through a partnership with the Art Research Center (ARC) of Ritsumeikan University in Kyoto.
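For readers unfamiliar with LoRA, the idea can be shown with a toy example. The sketch below is not Sakana's training code; it only illustrates the standard LoRA formulation, in which a frozen weight matrix W is adapted by adding a scaled low-rank product (alpha / r) * B * A, with only A and B being trained. The matrix sizes, the rank and the `alpha` value are illustrative assumptions, not figures from the company.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), the weight a LoRA-adapted layer applies."""
    r = len(a)                 # rank = number of rows in A
    scale = alpha / r
    delta = matmul(b, a)       # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, r):
    """Trainable parameters in the two low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Toy sizes, chosen only for illustration.
d_out, d_in, r, alpha = 6, 4, 2, 2.0
rng = random.Random(0)

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]   # frozen base weight
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]    # trainable, small random init
b = [[0.0] * r for _ in range(d_out)]                                # trainable, zero init

# Because B starts at zero, the adapted layer initially behaves exactly like the base layer.
w_eff = lora_effective_weight(w, a, b, alpha)

# Hypothetical 1280 x 1280 layer: full fine-tuning vs. rank-8 LoRA.
full_params = 1280 * 1280
lora_params = lora_param_count(1280, 1280, 8)
print(full_params, lora_params)
```

The appeal for a project like this is the size of the trainable part: in the hypothetical layer above, the low-rank factors hold a small fraction of the parameters of the full weight matrix, which is why a LoRA file can be distributed separately from the base model.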

“We curated this data with a wide range of subjects, including whole art and face-centered ones, from the digital images of ukiyo-e in the ARC collection. We also focused on multi-colored nishiki-e with beautiful colors while considering diversity,” the company wrote in a blog post.

The second model, Evo-Nishikie, is an image-to-image offering that colorizes monochrome ukiyo-e prints. Sakana says it can add color to historical book illustrations that were printed in a single ink color or give entirely new looks to existing multi-colored nishiki-e prints. All the user has to do is provide the source image and, optionally, pair it with a set of instructions describing the elements to be colored.

Sakana said it brought this model to life by performing ControlNet training on Evo-Ukiyoe, using fixed prompts and condition images.
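Sakana has not detailed how its condition images were produced, so the following is only a guess at what such training data could look like. The sketch assumes each training example pairs a fixed prompt with a grayscale version of a color print (the condition) and the original color print (the target). The `FIXED_PROMPT` string, the `TrainingPair` class and the grayscale conversion are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Hypothetical placeholder; the prompt Sakana actually used is not given in the source.
FIXED_PROMPT = "colorize this print"


@dataclass
class TrainingPair:
    prompt: str       # the same fixed prompt for every example
    condition: list   # grayscale image: rows of 0-255 integers
    target: list      # original color image: rows of (r, g, b) tuples


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to luma using the ITU-R BT.601 weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in rgb_image
    ]


def make_pair(rgb_image):
    """Build one (prompt, condition image, target image) training example."""
    return TrainingPair(FIXED_PROMPT, to_grayscale(rgb_image), rgb_image)


# A tiny 2 x 2 stand-in for a color print.
color_print = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
pair = make_pair(color_print)
print(pair.condition)
```

In a setup like this, the model would learn to map the monochrome condition back to the colored target, which matches the colorization behavior described above, though the real pipeline may differ.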

Goal for further research and development

While the models only support prompting in Japanese and are in the very early stages, Sakana hopes the work to teach AI traditional “Japanese beauty” will spread the appeal of the country’s culture worldwide and find applications in education and new ways of enjoying classical literature.

Currently, the company is providing both models and the associated code to get started on Hugging Face. The Python script included in the repository and LoRA weights are available under the Apache 2.0 license.

“This model is provided for research and development purposes only and should be considered as an experimental prototype. It is not intended for commercial use or deployment in mission-critical environments. Use of this model is at the user’s own risk, and its performance and outcomes are not guaranteed,” the company notes on Hugging Face.

So far, Sakana AI has raised $30 million in funding from multiple investors, including Lux Capital, which has invested in pioneering AI companies like Hugging Face, and Khosla Ventures, known for investing in OpenAI back in 2019.

Latest Posts

News 1 (Apps)

OpenAI sends internal memo releasing former employees from controversial exit agreements

OpenAI on Thursday backtracked on a controversial decision to, in effect, make former employees choose between signing a non-disparagement agreement that would never expire, or keeping their vested equity in the company.

Chris
May 23, 2024

News 2 (Products)

Google's AI Feature Suggested Using Glue to Keep Cheese on a Pizza

The tool, which gives AI-generated summaries of search results, appeared to instruct a user to put glue on pizza when they searched "cheese not sticking to pizza."

Chris
May 23, 2024

News 3 (Tutorial)

Meta Creates Group to Advise on AI Products

The Meta Advisory Group is composed of outside advisors that Meta's management team will periodically consult with on strategic opportunities related to our technology and product roadmap.

Chris
May 22, 2024

News 1 (Apps)

A new way to generate basketball analytics through tracking with computer vision and AI

In honor of the playoffs, I’d like to showcase what we’ve been working on here at Nexavision — a new way to generate basketball analytics through tracking with computer vision and AI:

Chris
May 23, 2024

News 2 (Products)

Amazon plans to give Alexa an AI overhaul — and a monthly subscription price

Amazon is planning to unveil a souped-up version of its decade-old voice assistant this year and will charge a monthly fee, sources say.

Chris
May 23, 2024

News 3 (Tutorial)

Adobe brings Firefly AI-powered Generative Remove to Lightroom

Adobe announced on Tuesday the addition of a Generative Remove feature for Lightroom. Built atop Firefly, the GenAI feature makes it possible to seamlessly edit objects out of photos. The feature arrives on Tuesday as early access.

Chris
May 22, 2024

News 4 (Apps)

Humane is looking for a buyer after the AI Pin’s underwhelming debut

The startup apparently thinks it’s worth between $750 million and $1 billion despite the deep software flaws and hardware issues of its first product.

Chris
May 22, 2024

News 5 (Products)

Meta introduces Chameleon, a state-of-the-art multimodal model

The architecture of Chameleon can unlock new AI applications that require a deep understanding of both visual and textual information.

Chris
May 22, 2024

News 6 (Tutorial)

In Seoul summit, heads of states and companies commit to AI safety

Government officials and AI industry executives agreed on Tuesday to apply elementary safety measures in the fast-moving field and establish an international safety research network.

Chris
May 22, 2024