Purpose: Internet of Things (IoT) applications are gaining steady acceptance in daily life; smart speakers in particular are making life easier and more convenient for consumers. This research aims to develop and test an integrated model of the factors influencing consumers' adoption of voice-enabled IoT devices. Design/methodology/approach: Based on the value-based adoption model (VAM), an integrated voice-enabled IoT device adoption model is proposed. Gender differences in five constructs related to perceived value (perceived usefulness, perceived enjoyment, perceived security risk, perceived technicality and perceived cost) were also examined using partial least squares multi-group analysis (PLS-MGA). Consumers' usage experience was controlled for in the integrated VAM. Findings: The results show that perceived usefulness, perceived enjoyment and perceived cost have a strong effect on perceived value, whereas perceived technicality and perceived security risk have no significant effect on it. Additionally, perceived value and social influence play a significant role in predicting adoption intention. Gender differences also exist in consumers' perceptions of usefulness, enjoyment and cost. Compared with the basic value-based adoption model, the integrated model provides more insight into consumers' adoption of voice-enabled IoT devices. Originality/value: Using an integrated model, this study is one of the first scholarly attempts to model the factors influencing the adoption of smart speakers, i.e. voice-enabled IoT devices, with implications for improving adoption.