You can also find all 100 answers here 👉 Devinterview.io - Python ML

Python 2.7 and Python 3.x are distinct versions of the Python programming language. They differ in syntax, features, and library support.
Python 2.7 is the last release in the 2.x series. It's still widely used but no longer actively developed.
Python 3 is the most recent version, with continuous updates and improvements. It's considered the present and future of the language.
- Print Statement: Python 2 uses `print` as a statement, while Python 3 requires it to be used as a function: `print()`.
- String Type: In Python 2, there are two main string types: byte strings (`str`) and Unicode strings (`unicode`). In Python 3, all strings are Unicode by default.
- Division: In Python 2, dividing two integers with `/` performs floor division. In Python 3, `/` always returns a float, while `//` performs floor division.
- Error Handling: Error-handling syntax is more uniform in Python 3; `except` clauses use the `except ExceptionType as err:` form, and multiple exception types must be grouped in parentheses.
Given that Python 2.x has reached its official end of life, businesses and communities are transitioning to Python 3 to ensure ongoing support, performance, and security updates. It's vital for developers to keep these differences in mind when migrating projects or coding in Python, especially for modern libraries and frameworks that might only be compatible with Python 3.
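The differences above can be sketched with a minimal Python 3 session:

```python
# print is a function, not a statement
print("hello")

# True division always returns a float; floor division uses //
print(7 / 2)   # 3.5
print(7 // 2)  # 3

# String literals are Unicode by default; bytes are a separate type
text = "café"
raw = text.encode("utf-8")    # explicit conversion to bytes
print(type(text), type(raw))  # <class 'str'> <class 'bytes'>
```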
Python employs an automatic memory management process, commonly known as garbage collection. This mechanism, combined with dynamic typing and the use of references rather than direct memory addresses, affords Python both advantages and limitations.
Ease of Use: Developers are relieved of manual memory management tasks, reducing the likelihood of memory leaks and segmentation faults.
Flexibility: Python's dynamic typing allows for more intuitive and rapid development without needing to pre-define variable types.
Abstraction: The absence of direct memory addressing simplifies code implementation, promoting a focus on higher-level tasks.
Performance Overhead: Garbage collection and dynamic typing can introduce latency, potentially impacting real-time or low-latency applications.
Resource Consumption: The garbage collection process consumes CPU and memory resources, sometimes resulting in inefficient use of system resources.
Fragmentation: Continuous allocation and deallocation of memory can lead to memory fragmentation, affecting overall system performance.
Memory Layout: A running Python program's memory includes the code segment, a global area, and the stack and heap for runtime data.
Reference Counting: Python uses a mechanism that associates an object with the number of references to it. When the reference count drops to zero, the object is deleted.
Automated Garbage Collection: Periodically, Python scans the memory to identify and recover objects that are no longer referenced.
Here is the Python code:
```python
import sys

# Define and reference an object
x = [1, 2, 3]
y = x

# Obtain the reference count
ref_count = sys.getrefcount(x)
print(ref_count)  # Output: 3 (x, y, and the temporary argument reference)
```
In this example, the list `[1, 2, 3]` has two references, `x` and `y`. Note: `sys.getrefcount` returns one more than the true count, because passing the object to the function creates a temporary reference.
PEP 8, short for Python Enhancement Proposal 8, is the official style guide for Python code. Authored by Guido van Rossum and others, it sets forth recommendations for writing clean, readable Python code.
Readability: Code structure and naming conventions should make the code clear and understandable, especially for non-authors and during collaborative efforts.
Consistency: The guide aims to minimize surprises by establishing consistent code styles and structures.
Maintainability: Following PEP 8 makes the codebase easier to manage, reducing technical debt.
PEP 8 addresses different aspects of Python coding, including:
- Indentation: Four spaces for each level, using spaces rather than tabs.
- Line Length: Suggests a maximum of 79 characters per line for readability.
- Blank Lines: Use proper spacing to break the code into logical segments.
- Imports: Recommended to group standard library imports, third-party library imports, and local application imports, with each group separated by a blank line.
- Whitespace: Define when to use spaces around different Python operators and structures.
- Naming Conventions: Dissect different naming styles for modules, classes, functions, and variables.
- Comments: Recommends judicious use of inline comments and docstrings.
Here is Python code that adheres to some PEP 8 guidelines:
```python
# Good - PEP 8 Compliant
import os
import sys
from collections import Counter

from myapp import MyModule


def calculate_sum(a, b):
    """Calculate and return the sum of a and b."""
    return a + b


class MyWidget:
    def __init__(self, name):
        self.name = name


# Not Recommended - Non-Compliant Code
def calculateProduct(a, b):  # camelCase name violates PEP 8
    someRandomEquation = 1 * a**2 / b
    return someRandomEquation
```
Let's discuss the key features, use-cases, and main points of difference among Python lists, tuples, and sets.
- Lists: Ordered, mutable, can contain duplicates, and are defined using square brackets `[]`.
- Tuples: Ordered, immutable, can contain duplicates, and are defined using parentheses `()`.
- Sets: Unordered, mutable, and do not contain duplicates. Sets are defined using curly braces `{}`, but for an empty set you should use `set()` to avoid inadvertently creating an empty dictionary.
Here is the Python code:
```python
# Defining
my_list = [1, 2, 3, 4, 4]   # list
my_tuple = (1, 2, 3, 4, 4)  # tuple
my_set = {1, 2, 3, 4, 4}    # set

# Output
print(my_list)   # [1, 2, 3, 4, 4]
print(my_tuple)  # (1, 2, 3, 4, 4)
print(my_set)    # {1, 2, 3, 4}
```
In the output, we observe that the list retained all elements, including duplicates. The tuple behaves similarly to a list but is immutable. The set automatically removes duplicates.
Let's consider a scenario where you might use lists, tuples, and sets when dealing with phone contacts.
- List (Ordered, Mutable, Duplicates Allowed): Useful for managing a contact list in an ordered manner, where you might want to add, remove, or update contacts. E.g., `contact_list = ["John", "Doe", "555-1234", "Jane", "Smith", "555-5678"]`.
- Tuple (Ordered, Immutable, Duplicates Allowed): If the contact details are fixed and won't change, you can use a tuple for each contact record. E.g., `contacts = (("John", "Doe", "555-1234"), ("Jane", "Smith", "555-5678"))`.
- Set (Unordered, Mutable, No Duplicates): Helpful when you need to remove duplicates from your contact list. E.g., `unique_numbers = {"555-1234", "555-5678"}`.
- Advantages: Versatile, allows duplicates, supports indexing and slicing.
- Disadvantages: Slower operations for large lists.
- Advantages: More memory-efficient, suitable for read-only data.
- Disadvantages: Once defined, its contents can't be changed.
- Advantages: High-speed membership tests and avoiding duplicates.
- Disadvantages: Not suitable for tasks requiring order.
A dictionary in Python is a powerful, built-in data structure for holding key-value pairs. Keys are unique, immutable objects such as strings, numbers, or tuples, while values can be any type of object.
- Ordering: Dictionaries were historically unordered, but since Python 3.7 they preserve insertion order. Unlike lists, elements are accessed by key rather than by positional index.
- Mutable: You can modify individual entries, but keys are fixed.
- Dynamic: Dictionaries can expand or shrink in size as needed.
Dictionaries are defined within curly braces{}, and key-value pairs are separated by a colon. Pairs are themselves separated by commas.
Here is the Python code:
```python
my_dict = {'name': 'Alice', 'age': 30, 'is_student': False}
```
- `dict.keys()`: Returns a view of all keys in the dictionary.
- `dict.values()`: Returns a view of all values in the dictionary.
- `dict.items()`: Returns a view of key-value pairs.
Here is the Python code:
```python
my_dict = {'name': 'Alice', 'age': 30, 'is_student': False}

# Accessing individual items
print(my_dict['name'])     # Output: Alice
print(my_dict.get('age'))  # Output: 30

# Changing values
my_dict['age'] = 31

# Inserting new key-value pairs
my_dict['gender'] = 'Female'

# Deleting key-value pairs
del my_dict['is_student']

# Iterating through keys and values
for key in my_dict:
    print(key, ':', my_dict[key])

# More concise iteration using dict.items()
for key, value in my_dict.items():
    print(key, ':', value)
```
Dictionaries in Python use a variation of a hash table. Their key characteristic is that they are very efficient for lookups (average-case O(1) time complexity).
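Because lookups hash the key directly to its slot, dictionary membership tests stay fast as the dictionary grows, unlike scanning a list. A small sketch (the `contacts` data is hypothetical):

```python
# Build a large mapping; lookup cost does not grow with size (average case)
contacts = {f"user{i}": f"555-{i:04d}" for i in range(100_000)}

# O(1) average-case operations
print("user99999" in contacts)        # True
print(contacts.get("nobody", "n/a"))  # n/a

# The equivalent list scan is O(n): every element may be inspected
names = list(contacts)
print("user99999" in names)           # True, but much slower
```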
List comprehension is a concise way to create lists in Python. It is especially popular in data science for its readability and efficiency.
The basic structure of a list comprehension can be given by:
```python
squared = [x**2 for x in range(10)]
```
This code is equivalent to:
```python
squared = []
for x in range(10):
    squared.append(x**2)
```
- Filtering: You can include an `if` clause to filter elements.
- Multiple Iterables: List comprehensions can iterate over multiple iterables, either with nested `for` clauses or in parallel via `zip`.
- Set and Dictionary Comprehensions: While we're discussing list comprehensions, it's noteworthy that Python offers similar syntax for sets and dictionaries.
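A quick sketch of the variants mentioned above (the sample data is illustrative):

```python
# Set comprehension: duplicates are removed automatically
lengths = {len(word) for word in ["a", "bb", "cc", "ddd"]}
print(lengths)  # {1, 2, 3}

# Dict comprehension: build a mapping in one expression
squares = {n: n**2 for n in range(5)}
print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

# Iterating two iterables in parallel with zip
names = ["Alice", "Bob"]
ages = [30, 25]
people = [f"{name} ({age})" for name, age in zip(names, ages)]
print(people)  # ['Alice (30)', 'Bob (25)']
```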
Consider filtering a list of numbers for even numbers and then squaring those. Here is what it looks like using traditional loops:
```python
evens_squared = []
for num in range(10):
    if num % 2 == 0:
        evens_squared.append(num**2)
```
Here is the equivalent using a list comprehension.
```python
evens_squared = [num**2 for num in range(10) if num % 2 == 0]
```
List comprehensions can be faster than the equivalent explicit loop: every list comprehension has an equivalent loop (it is syntactic sugar), but the iteration runs in optimized interpreter code. When building very long lists, a list comprehension can offer a noticeable performance improvement.
In Python, generators and list comprehensions are tools for constructing and processing sequences (like lists, tuples, and more). While both produce sequences, they differ in how and when they generate their elements.
List comprehensions are concise and powerful constructs for building and transforming lists. They typically build the entire list in memory at once, making them suitable for smaller or eagerly evaluated sequences.
Here is an example of a list comprehension:
```python
squares = [x**2 for x in range(10)]
print(squares)  # Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```
Generators, on the other hand, are memory-efficient, lazy sequences. They produce values on-the-fly when iterated, making them suitable for scenarios with potentially large or infinite datasets.
This is how you define a generator expression:
```python
squared = (x**2 for x in range(10))
print(type(squared))  # Output: <class 'generator'>

# When you want to retrieve the elements, you can iterate over it.
for num in squared:
    print(num)
```
- Memory Efficiency: Generators produce values one at a time, potentially saving significant memory.
- Composability: They can be combined with built-ins like `map` and `filter`, making them quite flexible.
- Infinite Sequences: Generators can model potentially infinite sequences, which would not be possible to represent with a list.
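The infinite-sequence point can be sketched with a generator function plus `itertools.islice` to take only a finite prefix:

```python
from itertools import islice

def squares():
    """Yield square numbers indefinitely."""
    n = 0
    while True:
        yield n * n
        n += 1

# The sequence is conceptually infinite; islice takes just the first few.
first_five = list(islice(squares(), 5))
print(first_five)  # [0, 1, 4, 9, 16]
```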
Using `sys.getsizeof`, let's compare the memory usage of a list versus a generator that both yield square numbers.
```python
import sys

# Memory usage for list
list_of_squares = [x**2 for x in range(1, 1001)]
print(sys.getsizeof(list_of_squares))

# Memory usage for generator
gen_of_squares = (x**2 for x in range(1, 1001))
print(sys.getsizeof(gen_of_squares))

"""
Output (exact sizes vary by Python build):
4056  # Memory in bytes for the list
120   # Memory in bytes for the generator
"""
```
In Python, `*args` and `**kwargs` allow a function to accept a variable number of positional and keyword arguments, respectively.
`*args` captures zero or more positional arguments. When a function defined with `*args` is called, the extra positional arguments are collected into a tuple inside the function, allowing a flexible number of arguments to be processed.
Here's an example:
```python
def sum_all(*args):
    return sum(args)

print(sum_all(1, 2, 3))  # Output: 6
```
`**kwargs` captures zero or more keyword arguments. When a function defined with `**kwargs` is called, the extra keyword arguments are collected into a dictionary inside the function. The double asterisk marks the parameter as the keyword-argument catch-all.
This feature is especially handy when developers are unsure about the exact nature or number of keyword arguments that will be transmitted.
Here's an example:
```python
def display_info(**kwargs):
    for key, value in kwargs.items():
        print(f"{key}: {value}")

display_info(name="Alice", age=25, location="New York")
# Output:
# name: Alice
# age: 25
# location: New York
```
Developers also have the flexibility to use both `*args` and `**kwargs` together in a function definition, allowing them to handle a mix of positional and keyword arguments.
Here's an example demonstrating mixed usage:
```python
def process_data(title, *args, **kwargs):
    print(f"Title: {title}")
    print("Positional arguments:")
    for arg in args:
        print(arg)
    print("Keyword arguments:")
    for key, value in kwargs.items():
        print(f"{key}: {value}")

process_data("Sample Data", 1, 2, complex_param=[4, 5, 6])
# Output:
# Title: Sample Data
# Positional arguments:
# 1
# 2
# Keyword arguments:
# complex_param: [4, 5, 6]
```
Python employs automatic garbage collection to manage memory, removing the need for manual memory management.

Python employs a reference counting strategy along with a cycle detector for more complex data structures.
Reference Counting:

- Each Python object header contains an `ob_refcnt` field, which counts the number of references to the object.
- When a reference to an object is created, copied, or deleted, `ob_refcnt` is updated accordingly.
- Reference counting ensures immediate object reclamation when an object is no longer referenced (i.e., `ob_refcnt` reaches 0). However, it cannot reclaim cyclic references on its own and may contribute to memory fragmentation.
Cycle Detector:
- CPython supplements reference counting with a generational cyclic garbage collector that detects and frees reference cycles.
- Common cyclic structures include doubly linked lists, parent-child relationships, and objects that reference themselves.
- The cycle detector runs periodically in the background. While generally efficient, collection times can be unpredictable, and cycles are not reclaimed the instant they become unreachable.
- Avoid Unnecessary Long-Lived References: To ensure timely object reclamation, limit the scope of references to the minimum required.
- Leverage Context Managers: Use the `with` statement to encapsulate resource-holding objects. This ensures the release of resources at the end of the block or upon an exception.
- Consider Explicit Deletion: In rare cases where it's necessary, you can manually delete references to objects using the `del` keyword.
- Use the Garbage Collection Module: The `gc` module provides utilities such as enabling or disabling the garbage collector and triggering collection manually. Use these judiciously, as overuse can impact performance.
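The cycle-detector behavior can be sketched with the `gc` module; the `Node` class is a hypothetical example of a cyclic structure:

```python
import gc

class Node:
    def __init__(self):
        self.partner = None

# Build a two-object reference cycle
a, b = Node(), Node()
a.partner = b
b.partner = a

# Drop the external references; refcounts never reach zero because
# the objects still reference each other.
del a, b

# The cycle detector reclaims them; collect() returns the number of
# unreachable objects it found in this run.
unreachable = gc.collect()
print(unreachable >= 2)  # True
```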
The strategies and mechanisms discussed are specific to CPython, the reference implementation of Python. Other Python implementations like Jython (for Java), IronPython (for .NET), and PyPy may employ different garbage collection methods for memory management.
Decorators in Python are higher-order functions that modify or enhance the behavior of other functions. They achieve this by taking a function as input, wrapping it inside another function, and then returning the wrapper.
Decorators are often used in web frameworks, such as Flask, for tasks like request authentication and logging. They enable better separation of concerns and modular code design.
- Debugging: Decorators can log function calls, parameter values, or execution time.
- Authentication and Authorization: They ensure functions are only accessible to authorized users or have passed certain validation checks.
- Caching: Decorators can store results of expensive function calls, improving performance.
- Rate Limiting: Useful in web applications to restrict the number of requests a function can handle.
- Validation: For data integrity checks, ensuring that inputs to functions meet certain criteria.
Here is the Python code:
```python
import time

def timer(func):
    """Decorator that times function execution."""
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"{func.__name__} took: {end_time - start_time} seconds")
        return result
    return wrapper

@timer
def sleep_and_return(num_seconds):
    """Wait for the given number of seconds, then return that number."""
    time.sleep(num_seconds)
    return num_seconds

print(sleep_and_return(3))  # Output: 3, and the time taken is printed
```
11. List the Python libraries that are most commonly used in machine learning and their primary purposes.

Here are some of the most widely used Python libraries for machine learning, along with their primary functions.
Key Features:
- A collection of algorithms for numerical optimization, integration, interpolation, Fourier transforms, signal processing, and linear algebra.
Libraries in SciPy:

- `scipy.optimize`: Numerical optimization and root finding.
- `scipy.integrate`: Numerical integration and ODE solvers.
- `scipy.interpolate`: Interpolation of data points.
- `scipy.signal`: Signal processing tools.
- `scipy.linalg`: Linear algebra routines.
Key Features:
- Core library for numerical computing with a strong emphasis on multi-dimensional arrays.
- Provides mathematical functions for multi-dimensional arrays and matrices.
Libraries in NumPy:
- `numpy.array`: Define arrays.
- `numpy.pi`: The mathematical constant π.
- `numpy.sin`, `numpy.cos`: Trigonometric functions.
- `numpy.sum`, `numpy.mean`: Basic statistical functions.
- `numpy.linalg.inv`, `numpy.linalg.det`: Linear algebra functions (matrix inversion and determinant).
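A brief sketch of these NumPy functions in use (the sample matrix is illustrative):

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])

print(np.pi)              # 3.141592653589793
print(np.sin(np.pi / 2))  # 1.0
print(a.sum(), a.mean())  # 10.0 2.5

# Linear algebra: determinant and inverse
print(np.linalg.det(a))   # approximately -2.0
print(np.linalg.inv(a))
```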
Key Features:
- The go-to library for data manipulation and analysis.
- Offers versatile data structures such as Series (1D arrays) and DataFrames (2D tables).
Libraries in Pandas:
- `pandas.Series`: Create and manipulate 1D labeled arrays.
- `pandas.DataFrame`: Build and work with labeled 2D tables.
- `pandas.read_csv`, `pandas.read_sql`: Read data from various sources like CSV files and SQL databases.
- Data cleaning and preprocessing tools: `fillna()`, `drop_duplicates()`, and others.
- `pandas.plotting`: Functions for data visualization.
Key Features:
- A comprehensive library for creating static, animated, and interactive visualizations in Python.
- Offers different plotting styles.
Libraries in Matplotlib:
- `matplotlib.pyplot.plot`: Create line plots.
- `matplotlib.pyplot.scatter`: Generate scatter plots.
- `matplotlib.pyplot.hist`: Build histograms.
- `matplotlib.pyplot.pie`: Create pie charts.
- A leading open-source platform designed for machine learning.
- Offers a comprehensive range of tools, libraries, and resources enabling both beginners and seasoned professionals to practice deep learning.
- A high-level, neural networks library, running on top of TensorFlow or Theano.
- Designed to make experimentation and quick deployment of deep learning models seamless and user-friendly.
- A powerful toolkit for all thingsmachine learning, including supervised and unsupervised learning, model selection, and data preprocessing.
- A data visualization library that integrates seamlessly with pandas DataFrames.
- Offers enhanced aesthetic styles and several built-in themes for a visually appealing experience.
- A rich toolkit for natural language processing (NLP) tasks.
- Encapsulates text processing libraries along with lexical resources such as WordNet.
- A well-established library for computer vision tasks.
- Collectively, this robust library has over 2500 optimized algorithms focused on real-time operations.
- These libraries offer exceptional speed and performance for gradient boosting.
- They do this by employing techniques like exclusive features and avoiding unnecessary memory allocation.
- A useful option for Big Data applications, particularly when coupled with Apache Spark.
- It integrates seamlessly with RDDs, DataFrames, and SQL.
- A comprehensive library encompassing tools for statistical modeling, hypothesis testing, and exploring datasets.
- It offers a rich set of regression models, including Ordinary Least Squares (OLS) and Generalized Linear Models (GLM).
- There are plenty of other libraries catering to specific areas, such as `h2o` for machine learning, `CloudCV` for cloud-based computer vision, and `Imbalanced-learn` for handling imbalanced datasets in classification tasks.
NumPy is a fundamental package used in scientific computing and a cornerstone of many Python-based machine learning frameworks. It provides support for the efficient manipulation of multi-dimensional arrays and matrices, offering a range of mathematical functions and tools.
- ndarray: NumPy's core data structure, the multi-dimensional array, optimized for numerical computations.
- Mathematical Functions: An extensive library of functions that operate on arrays and data structures, enabling high-performance numerical computations.
- Linear Algebra Operations: Comprehensive support for linear algebra, including matrix multiplication, decomposition, and more.
- Random Number Generation: Tools to generate random numbers, both from different probability distributions and with various seeds.
- Performance Optimizations: NumPy's vectorized operations are implemented in C, making them substantially faster than equivalent pure-Python loops.
Data Representation: NumPy offers an efficient way to manipulate data, a key ingredient in most machine learning algorithms.
Algorithms and Analytics: Many machine learning libraries leverage NumPy under the hood. It's instrumental in tasks such as data preprocessing, feature engineering, and post-training analytics.
Data Integrity and Homogeneity: ML algorithms often require a consistent data type and structure, which NumPy arrays guarantee.
Compatibility with Other Libraries: NumPy arrays are often the input and output of other packages, ensuring seamless integration and optimized performance.
Here is the Python code:
```python
import numpy as np

# Create a random dataset for demonstration
np.random.seed(0)
data = np.random.rand(10, 4)

# Center the data
data_mean = np.mean(data, axis=0)
data_centered = data - data_mean

# Calculate the covariance matrix
cov_matrix = np.cov(data_centered, rowvar=False)

# Eigen decomposition
_, eigen_vectors = np.linalg.eigh(cov_matrix)

# Project data onto the computed eigen vectors
projected_data = np.dot(data_centered, eigen_vectors)
print(projected_data)
```
Pandas is a powerful Python library for data manipulation, analysis, and visualization. Its flexibility and wealth of capabilities have made it indispensable across industries.
- Series: A one-dimensional array with labels that supports many data types.
- DataFrame: A two-dimensional table with rows and columns. It's the primary Pandas data structure.
- Data Alignment: Ensures different data structures are aligned appropriately.
- Integrated Operations: Allows for efficient handling of missing data.
- File I/O: Pandas supports numerous file formats, including CSV, Excel, SQL databases, and more, making data import and export seamless.
- Data Integration: Offers robust methods for combining datasets.
- Well-suited for working with time-based data, it provides convenient functionalities, such as date range generation.
- Offers an interface to Matplotlib for straightforward data plot generation.
- Includes interactive plotting features.
- Provides support for out-of-core data processing through the 'chunking' method.
- Utilizes Cython and other approaches to improve performance.
- Simplifies tasks, such as database-style join operations between DataFrame objects.
- Offers intuitive methods to filter data based on certain conditions.
- Supports data-grouping operations with aggregate functionalities.
- Allows for custom function application through the `apply()` method.
- Supports multi-indexing, which means using more than one index level.
- Integrates simple yet effective tools for dealing with null values or missing data.
- Offers capabilities for data normalization and transformation.
- Familiar statistical functions, like mean, standard deviation, and others, are built-in for quick calculations.
- Supports generation of descriptive statistics.
- Pandas' `str` accessor enables efficient handling of text data.
- Optimizes memory usage and provides enhanced computational speed for categorical data.
- Provides numerous methods for managing and handling DataFrames and Series efficiently.
- Users can enhance Pandas' capabilities through various add-ons, such as 'pandas-profiling' or 'pandasql'.
- Good for Small to Mid-size Data: It's especially helpful when handling datasets that fit in memory.
- Rich Data Structures: Offers a variety of structures efficient for specific data handling tasks.
- Integrated with Core Data Science Stack: Seamless compatibility with tools like NumPy, SciPy, and scikit-learn.
- Comprehensive Functionality: Provides a wide range of methods for almost all data manipulation requirements.
- Data Analysis Boost: It uniquely combines data structures and methods to elevate data exploration and analysis workflows.
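Several of the capabilities above (missing data, deduplication, filtering, grouping) can be sketched in a few lines; the contact dataset is hypothetical:

```python
import pandas as pd

# A small, hypothetical contact dataset
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Alice", "Carol"],
    "city": ["NYC", "LA", "NYC", None],
    "calls": [3, 5, 3, 2],
})

# Handle missing data and duplicate rows
df["city"] = df["city"].fillna("Unknown")
df = df.drop_duplicates()

# Filtering and grouping with aggregation
busy = df[df["calls"] > 2]
per_city = df.groupby("city")["calls"].sum()

print(busy)
print(per_city)
```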
Scikit-learn is a modular, robust, and easy-to-use machine learning library for Python, with a powerful suite of tools tailored to both model training and evaluation.
- These are algorithms for model fitting, covering both supervised and unsupervised learning. They include classifiers, regressors, and clustering tools like K-means.
Core Methods:
- `.fit()`: Model training.
- `.predict()`: Making predictions in supervised settings.
- `.transform()`: Transforming or reducing data, commonly in unsupervised learning.
- `.fit_predict()`: Combining training and prediction in specific cases.
- These convert or alter data, providing a helpful toolbox for preprocessing. Both unsupervised tasks (like feature scaling and PCA) and supervised tasks (like feature selection and resampling) are supported.
Core Methods:
- `.fit()`: Used to learn transformation parameters from training data.
- `.transform()`: Applies the learned transformation to data.
- `.fit_transform()`: A convenience method combining the fit and transform operations.
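The fit/transform pattern can be sketched with `StandardScaler`; the tiny arrays are illustrative:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[2.0]])

scaler = StandardScaler()
scaler.fit(X_train)                       # learn mean and std from training data
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)  # reuse the same learned parameters

print(X_train_scaled.ravel())  # zero mean, unit variance
print(X_test_scaled.ravel())   # [0.] because 2.0 is the training mean
```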
- These organize transformations and models into a single unit, ensuring that all steps in the machine learning process are orchestrated seamlessly.
Core Methods:
- `.fit()`: Executes the necessary fit and transform steps in sequence.
- `.predict()`: Transforms the data through the pipeline steps, then generates predictions of the target variable.
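A minimal pipeline sketch; the dataset and estimator choice (Iris plus logistic regression) are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Chain preprocessing and a model into a single estimator
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

pipe.fit(X, y)           # fits the scaler, transforms X, then fits the model
preds = pipe.predict(X)  # transforms X with the fitted scaler, then predicts
print(preds[:5])
```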
- The library boasts a vast array of techniques for assessing model performance. It supports methods tailored to specific problem types, such as classification or regression.
- Unified API: Scikit-learn presents a consistent interface across all supported algorithms.
- Interoperability: Functions are readily combinable and adaptable, permitting tailored workflows.
- Robustness: Verbose documentation and built-in error handling.
- Model Evaluation: The library offers a suite of tools tailored towards model assessment and cross-validation.
- Performance Metrics Suite: A comprehensive collection of scoring metrics covering classification, regression, and clustering problems.
Here is the Python code:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a sample dataset and split it into train and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Create a classifier
clf = DecisionTreeClassifier()

# Train the classifier
clf.fit(X_train, y_train)

# Use the trained classifier for prediction
y_pred = clf.predict(X_test)
```
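The model-evaluation tools mentioned above can be sketched as follows; the dataset and classifier choice (Iris plus a decision tree) are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Single-split evaluation
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

# 5-fold cross-validation for a more robust estimate
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```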
Matplotlib is one of the most widely used libraries for data visualization in Python. It provides a wide range of visualizations, and its interface is highly flexible, allowing for fine-grained control.
Seaborn, built on top of Matplotlib, is a higher-level library that focuses on visual appeal and offers a variety of high-level plot styles. It simplifies the process of plotting complex data, making it especially useful for exploratory analysis.
- Core Flexibility: Matplotlib equips you to control every aspect of your visualization.
- Customizable Plots: You can customize line styles, colors, markers, and more.
- Subplots and Axes: Create multi-plot layouts and specify dimensions.
- Backends: Choose from various interactive and non-interactive backends, suiting different use-cases.
- Output Flexibility: Matplotlib supports a range of output formats, including web, print, and various image file types.
- High-Level Interface: Offers simpler functions for complex visualizations like pair plots and violin plots.
- Attractive Styles: Seaborn has built-in themes and color palettes for better aesthetics.
- Dataset Integration: Directly accepts Pandas DataFrames.
- Time-Saving Defaults: Many Seaborn plots provide well-optimized default settings.
- Categorical Plots: Specifically designed to handle categorical data for easier visual analysis.
Matplotlib: Meticulous control over markers, colors, and sizes.
```python
import matplotlib.pyplot as plt

plt.scatter(x, y, c='red', s=100, marker='x')
```
Seaborn: Quick setup with additional features like trend lines.
```python
import seaborn as sns

# Recent Seaborn versions require keyword arguments for x and y
sns.scatterplot(x=x, y=y, hue=some_category, style=some_other_category)
```
Matplotlib: Standard line plot visualization.
```python
import matplotlib.pyplot as plt

plt.plot(x, y)
```
Seaborn: Offers different styles for lines, emphasizing the trend.
```python
import seaborn as sns

sns.lineplot(x=x, y=y, estimator='mean')
```
Matplotlib: Default functionalities for constructing histograms.
```python
import matplotlib.pyplot as plt

plt.hist(x, bins=10)
```
Seaborn: High-level interface for one-liner histograms.
```python
import seaborn as sns

sns.histplot(x, kde=True)
```
Matplotlib: Provides bar plots and enables fine-tuning.
```python
import matplotlib.pyplot as plt

plt.bar(categories, values)
```
Seaborn: Specialized categorical features for easy category-specific analysis.
```python
import seaborn as sns

sns.catplot(x='category', y='value', kind='bar', data=data)
```
Matplotlib: Offers heatmap generation, but with more control and detailed setup.
```python
import matplotlib.pyplot as plt

plt.imshow(data, cmap='hot', interpolation='none')
```
Seaborn: Simplified, high-level heatmap functionality.
```python
import seaborn as sns

sns.heatmap(data, annot=True, fmt="g")
```
While both Matplotlib and Seaborn allow customization, Seaborn stands out for its accessible interface. It comes with several built-in visual styles to enhance the aesthetics of plots.
The code for selecting a style:
```python
import seaborn as sns

sns.set_style("whitegrid")
```
- Performance: Matplotlib is faster when dealing with large datasets due to its lower-level operations.
- Specialized Plots: Seaborn excels in handling complex, multivariable datasets, providing numerous statistical and categorical plots out of the box.