Authors: Huaiyu Zhou, Shuangbin Xiang
Published: 15 Apr 2024
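Note: This is an abridged version and should not be cited directly. The full text in Chinese and English was published in Landscape Architecture Frontiers (《景观设计学》), 2024, Vol. 12, No. 2, in the issue themed “Intelligent Landscape Design for Key Challenges.”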
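Overview
Artificial intelligence (AI) image generation is changing traditional working modes in landscape design. Among these technologies, “image-to-image” generative adversarial network (GAN) techniques have the potential to assist scheme design, so a user-oriented evaluation of their technical applicability is particularly important for optimizing tool selection and improving design efficiency. This study uses image analysis and user surveys to evaluate the quality of GAN-generated results, how effectively they connect with design work, and how well landscape designers accept the generated images. Taking the two tasks of layout generation and plan rendering in the pix2pix–BicycleGAN workflow as evaluation objects, the study established image analysis metrics, including the absolute and Euclidean distances based on block number, histogram distance, and the structural similarity index, and conducted two online questionnaire surveys on the visual realism of GAN-generated results and on color and texture preferences. The results show that GAN-generated layouts are highly similar to real layouts, and that GAN-rendered plans meet the requirements of conceptual scheme presentation and are well accepted by users. Finally, the paper discusses the internal rationality of the GAN generation method and its limitations regarding professional ethics and data bias, and reflects on the current technical gap between AI-aided design and evidence-based design.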
Keywords
Landscape architecture; image generation; generative adversarial network; AI-aided design; applicability evaluation; landscape plan
Applicability Evaluation and Reflection on Artificial Intelligence-Based “Image to Image” Generation of Landscape Architecture Masterplans
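1 Department of Architecture, School of Architecture and Planning, Hunan University
2 Department of Landscape Architecture, School of Architecture, Tsinghua University
3 Landscape Studio, Architecture Institute, Beijing General Municipal Engineering Design & Research Institute Co., Ltd.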
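01 Introduction
In the recent surge of generative artificial intelligence (AI), rapidly developing image generation technologies and drawing tools keep challenging the working modes of the traditional landscape design industry. The image generation technologies that can currently be connected to landscape design workflows are mainly applied in two areas: plan generation and perspective rendering.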
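Research on plan generation is mainly based on “image-to-image” generative adversarial networks (GANs). These tools started with the generation of architectural floor plans and have since extended to the generation of building arrangements and massing relationships. In recent years, plan generation research has also begun in landscape design, but several problems remain: publicly available landscape plan datasets are lacking, and training data are not rich enough; the scale of plans that can be generated is limited, and the methods mainly suit small and medium-sized green spaces; systematic quantitative evaluation of GAN-generated plans is rare, and easy-to-use evaluation metrics are missing; and few surveys target end users, which makes it difficult to collect feedback on use.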
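Research on and applications of perspective rendering generation mainly revolve around two “text-to-image” tools, Midjourney and Stable Diffusion. Of the two, the open-source Stable Diffusion model can not only generate images from keywords but also offers freely trainable “image-to-image” and “model-to-image” functions. A workflow for architectural form conception and modeling based on Stable Diffusion has now taken initial shape.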
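This study focuses on GAN-based landscape plan generation and evaluates its technical applicability comprehensively from the perspective of landscape designers, in order to support designers' decisions when selecting tools. Using image analysis and user surveys, it evaluates the quality of GAN-generated results, how effectively they connect with design work, and how well landscape designers accept the generated images.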
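02 Evaluation Objects
This study evaluates the applicability of two key tasks in the pix2pix–BicycleGAN landscape plan generation workflow: layout generation and plan rendering. GAN-generated layouts resemble the bubble diagrams and plan sketches used in design teaching and form the basis for design iteration and adjustment. GAN renderings add color and texture detail to the abstract forms in the layout, which makes them more readable. Of the tools used for these tasks, pix2pix is a widely applied model in the GAN field, while BicycleGAN is an improved model based on CycleGAN. Because the plan types collected and annotated in the dataset are limited, the workflow currently suits mainly small and medium-scale landscape sites.
Examples of layout generation and plan rendering in the pix2pix–BicycleGAN workflow © Huaiyu Zhou, Shuangbin Xiang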
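Generating layouts in multiple styles
Given a site boundary as input, the pix2pix model can generate site layouts in multiple styles that contain different land-use types. The evaluation addresses the similarity between generated and real land-use block layouts, as well as the visual realism of the generated layouts.
A total of 2725 real landscape plans were collected for this study. The training sets for the mixed, curvilinear, polyline, and organic styles contain 2670, 916, 770, and 954 plans respectively, and 85 plans were reserved as a validation set for assessing the generated results. Across the four styles, 340 GAN-generated layouts were obtained for subsequent evaluation. After comparing several GAN-generated layouts, the designer develops a more precise block layout based on project requirements and personal experience, and uses it as the input for the plan rendering task.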
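Rendering plans in multiple colors and textures
The designer inputs the adjusted layout into BicycleGAN to obtain rendered plans with different colors and textures, which helps communicate design ideas quickly to clients. The evaluation of this task addresses the similarity between GAN-rendered and manually rendered plans, and users' color and texture preferences. The dataset contains 325 landscape plans, with 300 in the training set and 25 in the validation set. For each layout, one warm-toned and one cool-toned result were selected, giving 50 images for evaluation.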
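03 Evaluation Methods
This study combines image analysis and user surveys to build an evaluation metric system.
Image analysis metrics
Metrics for generated layouts
The block number (BN) of the five generated land-use types most directly reflects the morphological diversity of GAN-generated layouts, and the corresponding block number distance (BND) can be used to assess the difference between the 340 validation-set layouts generated by pix2pix and the real layouts. BND evaluation comprises two metrics: absolute BND and Euclidean BND. The study uses the absolute distance to compare, within a single style, the BN of each land-use type in generated layouts against that in real layouts. It also uses an aggregated analysis of absolute BND and Euclidean BND to compare how tightly block division clusters across the four styles, with an aggregation plot showing where the midpoints of the two sets of data cluster.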
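The two BND metrics can be illustrated with a minimal sketch. It assumes that absolute BND is the per-class absolute difference in block counts and that Euclidean BND is the Euclidean distance between the two five-element count vectors; the abridged text does not give the exact formulas. The class names and counts below are hypothetical placeholders, not data from the study.

```python
import math

# Hypothetical labels: the source only states that there are five land-use types.
LAND_USE_CLASSES = ["class_a", "class_b", "class_c", "class_d", "class_e"]


def absolute_bnd(generated, real):
    """Per-class absolute difference in block number (assumed definition)."""
    return {c: abs(generated[c] - real[c]) for c in LAND_USE_CLASSES}


def euclidean_bnd(generated, real):
    """Euclidean distance between the two block-number vectors (assumed definition)."""
    return math.sqrt(sum((generated[c] - real[c]) ** 2 for c in LAND_USE_CLASSES))


# Made-up block counts for one generated layout and its real counterpart.
generated_counts = {"class_a": 4, "class_b": 2, "class_c": 6, "class_d": 1, "class_e": 3}
real_counts = {"class_a": 3, "class_b": 2, "class_c": 4, "class_d": 1, "class_e": 5}

print(absolute_bnd(generated_counts, real_counts))
print(euclidean_bnd(generated_counts, real_counts))
```

Averaging the per-class absolute values over all layouts of one style would give a style-level summary of the kind the results section reports.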
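An image histogram shows the frequency distribution of pixels with different RGB values in an image. Histogram distance (HistD) is a key metric for measuring the difference in pixel distribution between two images, and it can effectively assess how GAN-generated and real layouts differ in land-use block division and area proportion. HistD ranges over [0, 1], and a value below 0.5 indicates that the two images follow a broadly similar trend.
Schematic of the block number distance and histogram distance methods © Huaiyu Zhou, Shuangbin Xiang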
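As a rough illustration of a histogram distance bounded by [0, 1], the sketch below uses half the L1 distance between two normalized color histograms. This particular formula is an assumption chosen because it has the stated range; the abridged text does not say which histogram distance the authors used. The colors and pixel lists are made up.

```python
from collections import Counter


def color_histogram(pixels):
    """Normalized frequency of each RGB color in a flat list of pixels."""
    counts = Counter(pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


def hist_distance(pixels_a, pixels_b):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    hist_a = color_histogram(pixels_a)
    hist_b = color_histogram(pixels_b)
    colors = set(hist_a) | set(hist_b)
    return 0.5 * sum(abs(hist_a.get(c, 0.0) - hist_b.get(c, 0.0)) for c in colors)


# Hypothetical land-use colors in a labeled layout image.
GREEN, GREY, BLUE = (0, 160, 0), (128, 128, 128), (0, 0, 255)

real_layout = [GREEN] * 6 + [GREY] * 3 + [BLUE] * 1
generated_layout = [GREEN] * 5 + [GREY] * 3 + [BLUE] * 2

print(hist_distance(real_layout, generated_layout))
```

Because each land-use type is drawn in a single color, the histogram of a layout image is effectively its area proportion per land-use type, which is why this metric speaks to area proportion rather than to block shape.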
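Metrics for plan rendering
The structural similarity index (SSIM) is a widely used image similarity measure that assesses the perceptual difference between two images (x, y) of the same origin that have been processed differently. This study computes SSIM to assess the difference between GAN-rendered plans and plans rendered manually by landscape designers. SSIM is reported here on a range of [0, 1], where 1 means the two images have identical structure and 0 means they are completely different. The HistD metric described above is also included among the plan rendering metrics.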
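The SSIM formula can be sketched in a simplified, single-window form. Standard implementations compute SSIM over local sliding windows and average the results; the version below applies the formula once to the whole image, so it is an illustrative approximation rather than the procedure used in the study. The constants K1 = 0.01 and K2 = 0.03 are the commonly used defaults, and the pixel values are made up.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two equal-length lists of grayscale values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((v - mean_x) * (v - mean_x) for v in x) / n
    var_y = sum((v - mean_y) * (v - mean_y) for v in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Made-up grayscale pixels for a manual rendering and a GAN rendering.
manual_render = [10, 50, 90, 130, 170, 210]
gan_render = [20, 45, 100, 120, 180, 200]

print(global_ssim(manual_render, manual_render))
print(global_ssim(manual_render, gan_render))
```

The formula multiplies a luminance term (means) by a combined contrast and structure term (variances and covariance), which matches the dimensions on which the results section compares the renderings.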
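User survey metrics
To assess whether GAN-generated layouts can visually pass for real ones, and to understand practitioners' color and texture preferences, the research team released two online questionnaires on the Wenjuanxing platform between September 1 and October 31, 2023, addressed to teachers, students, and professional designers in landscape design and related fields. The questionnaires were distributed mainly to the School of Architecture and Planning at Hunan University, the School of Architecture at Tsinghua University, and the Beijing General Municipal Engineering Design & Research Institute. Respondents were also asked to indicate their years of study or practice, to help ensure that the results are representative and reliable.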
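Questionnaire 1
Questionnaire 1 was designed as a Turing test of GAN-generated layouts and as a measure of how well practitioners accept them. It contained 16 layouts generated automatically by pix2pix and drawn at random from the validation set, along with 14 layouts redrawn from schemes by well-known firms or master designers. Respondents were asked to select the images they believed were AI-generated, with no limit on the number of selections.
Online Questionnaire 1 (orange numbers indicate GAN-generated layouts) © Huaiyu Zhou, Shuangbin Xiang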
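The way such a test can be tallied is sketched below, assuming each response is recorded as the set of image IDs the respondent marked as AI-generated. The IDs and responses are toy data, and the function names are hypothetical; none of this comes from the study's records.

```python
def identification_rates(responses, image_ids):
    """Share of respondents who marked each image as AI-generated."""
    n = len(responses)
    return {img: sum(img in r for r in responses) / n for img in image_ids}


def mean_rate(rates, subset):
    """Average marking rate over a subset of images."""
    return sum(rates[img] for img in subset) / len(subset)


# Toy data: two GAN-generated layouts, two designer layouts, four respondents.
gan_ids = ["g1", "g2"]
real_ids = ["r1", "r2"]
responses = [{"g1", "r1"}, {"g1", "g2"}, {"g2"}, set()]

rates = identification_rates(responses, gan_ids + real_ids)
print(mean_rate(rates, gan_ids))   # how often GAN layouts were marked as AI
print(mean_rate(rates, real_ids))  # how often designer layouts were marked as AI
```

One minus the first figure is the share of cases in which a GAN layout passed as a designer's work, and the second figure is the false-alarm rate on real layouts.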
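Questionnaire 2
Questionnaire 2 was designed to determine how practitioners' acceptance differs across renderings produced by several mainstream GAN models. It presented 30 rendered plans (10 groups of 3, produced by pix2pix, CycleGAN, and BicycleGAN respectively) and asked respondents to judge whether the renderings met the standard for communicating schemes at the conceptual design stage, and to choose the best plan in each group based on color and texture.
Online Questionnaire 2 © Huaiyu Zhou, Shuangbin Xiang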
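04 Evaluation Results
Image analysis
Results for generated layouts
An overall comparison of GAN-generated and real layouts shows that the two have similar levels of BN diversity in a statistical sense and notably similar block area proportions.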
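1) The mean absolute BND values (Table 1) show that, within a single layout, the BN of each of the five GAN-generated land-use types differs from that of the real layout by fewer than 5 blocks. The main difference lies in the number of small landscape structures, which suggests that the GAN and human designers show similar diversity in dividing land uses.
2) To determine whether differences in the number of layouts in the four style training sets lead to significantly different BND results, an aggregated analysis of absolute BND and Euclidean BND was carried out across the four styles and five land-use types. The results show that the four styles are also strongly similar in land-use division, which indicates that the differences in training set size did not significantly affect the training results.
Aggregation plot of absolute BND and Euclidean BND between GAN-generated and real layouts © Huaiyu Zhou, Shuangbin Xiang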
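3) The mean HistD values for the mixed, curvilinear, polyline, and organic styles are 0.41, 0.45, 0.41, and 0.43 respectively. All are below 0.5, which means that the area proportions that GAN-generated layouts assign to different land-use types broadly follow those of real layouts.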
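Results for plan rendering
The mean SSIM and HistD values of the 50 renderings were calculated, and the results are shown in Table 2. Overall, the analysis indicates that GAN-rendered plans are highly similar to renderings drawn by professional designers in pixel distribution, structure, contrast, and luminance.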
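User surveys
Results of Questionnaire 1
Questionnaire 1 received 192 valid responses, and 55% of respondents had more than five years of professional experience, which supports the reliability of the results. The 16 GAN-generated layouts were identified as AI-generated with an average probability of 54.7%, slightly higher than random guessing. In other words, GAN-generated layouts had about a 45% chance of being mistaken by practitioners for layouts created by designers. Real layouts created by designers, in turn, had about a 25% chance of being judged GAN-generated. Overall, GAN-generated layouts can confuse some respondents, and about 70% of respondents believed that GAN technology has the potential to support scheme design.
Average probability of GAN-generated and real layouts being identified as AI-generated © Huaiyu Zhou, Shuangbin Xiang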
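The research team then contacted respondents by phone, WeChat, and email to discuss how they distinguished AI-generated layouts from those drawn by designers. Unreasonable details in functional design were found to seriously undermine the visual realism of GAN schemes. The study groups the defects of GAN layouts into three categories: 1) incomplete entrances; 2) discontinuous paths; and 3) inaccessible nodes. Of these, discontinuous paths are the most evident problem.
Examples of the three types of defects in GAN-generated layouts and designers' adjustments © Huaiyu Zhou, Shuangbin Xiang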
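Results of Questionnaire 2
Questionnaire 2 received 422 valid responses. Of the respondents, 55% had a professional background in landscape design and 37% had more than five years of professional experience. The results show that 91% of respondents considered the quality of GAN plan rendering sufficient for scheme development and communication at the conceptual design stage, and 47% considered BicycleGAN the best performer in color and texture.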
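05 Conclusions and Discussion
This paper introduces image analysis and user survey metrics to evaluate the technical applicability of the GAN generation method. It aims to fill a gap in existing research, which focuses mainly on training methods and lacks post-hoc evaluation, and to provide an easy-to-use evaluation framework for research on “image-to-image” generative design. The image analysis results show that both the land-use distribution diversity of GAN-generated layouts relative to real layouts and the similarity of GAN-rendered plans to designers' rendered plans reach a high level. The user survey results show that GAN-generated layouts are hard to tell apart from real ones, and that the rendered colors and textures are accepted by landscape designers. Although GAN models involve many black-box processes internally, this study offers quantitative support for the rationality of their internal logic.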
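The study has several main limitations. First, it does not include an ethical evaluation of the GAN generation method. Design generally needs to meet functional requirements within a specific regional and environmental context, whereas GAN generation methods often lack an understanding of formal symbols shaped by complex historical and cultural factors. The questionnaires in this study did not address the originality of GAN generation, and future research should collect users' views on ethical issues. Second, the evaluation framework established here does not consider the diversity of GAN-generated layouts or bias in the training data. The output of AI tools is strongly affected by their training data, and current landscape plan datasets are severely lacking in diversity. Applying these tools indiscriminately would homogenize design outcomes, so ways to avoid this potential loss of design diversity need to be explored urgently. Third, although the pix2pix–BicycleGAN workflow evaluated here is fairly typical, it does not represent the latest technical iterations. Future research could explore customized GAN models for specific regions or types of landscape design (such as classical Chinese gardens and modern Western landscapes), incorporate data with more regional characteristics into model training, and develop algorithms that can recognize and emphasize these characteristics.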
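In addition, the low interpretability of GAN generation methods exposes them to challenges from evidence-based design. Form is only one aspect of design, and the scientific thinking of “design with nature” requires overlaying multiple factors (such as topography, soil, runoff, and vegetation) to justify design decisions. How to connect the formal expression represented by GAN models with the quantitative analysis represented by physical models is therefore a problem that must be solved before AI can be deeply integrated into the design disciplines. As the diversity of GAN-generated layouts increases, using multi-objective optimization algorithms to screen and optimize them will help make design decisions more scientific. As generative algorithms are updated, physical models and optimization algorithms may gradually merge with AI models, which would significantly improve the interpretability and depth of application of GAN generation methods.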
Please cite this article as
Zhou, H., & Xiang, S. (2024). Applicability evaluation and reflection on artificial intelligence-based “image to image” generation of landscape architecture masterplans. Landscape Architecture Frontiers, 12(2), 58–67.