Oct 20, 2021 · Nov 28, 2021 · Jun 20, 2022
diff --git a/Advanced/regex/regex_tutorial_exercise_answer.ipynb b/Advanced/regex/regex_tutorial_exercise_answer.ipynb
 {
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**1. Extract all Twitter handles from the following text. A Twitter handle is the text that appears after https://twitter.com/ and is a single word. It contains only alphanumeric characters (A-Z, a-z, 0 to 9) and the underscore _**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['elonmusk', 'teslarati', 'dummy_tesla', 'dummy_2_tesla']"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = '''\n",
    "Follow our leader Elon musk on twitter here: https://twitter.com/elonmusk, more information \n",
    "on Tesla's products can be found at https://www.tesla.com/. Also here are leading influencers \n",
    "for tesla related news,\n",
    "https://twitter.com/teslarati\n",
    "https://twitter.com/dummy_tesla\n",
    "https://twitter.com/dummy_2_tesla\n",
    "'''\n",
    "pattern = r'https://twitter\\.com/([a-zA-Z0-9_]+)'\n",
    "\n",
    "re.findall(pattern, text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2. Extract concentration risk types. A risk type is the text that appears after \"Concentration of Risk:\". In the example below, your regex should extract these two strings:**\n",
    "\n",
    "(1) Credit Risk\n",
    "\n",
    "(2) Supply Risk"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Credit Risk', 'Supply Risk']"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = '''\n",
    "Concentration of Risk: Credit Risk\n",
    "Financial instruments that potentially subject us to a concentration of credit risk consist of cash, cash equivalents, marketable securities,\n",
    "restricted cash, accounts receivable, convertible note hedges, and interest rate swaps. Our cash balances are primarily invested in money market funds\n",
    "or on deposit at high credit quality financial institutions in the U.S. These deposits are typically in excess of insured limits. As of September 30, 2021\n",
    "and December 31, 2020, no entity represented 10% or more of our total accounts receivable balance. The risk of concentration for our convertible note\n",
    "hedges and interest rate swaps is mitigated by transacting with several highly-rated multinational banks.\n",
    "Concentration of Risk: Supply Risk\n",
    "We are dependent on our suppliers, including single source suppliers, and the inability of these suppliers to deliver necessary components of our\n",
    "products in a timely manner at prices, quality levels and volumes acceptable to us, or our inability to efficiently manage these components from these\n",
    "suppliers, could have a material adverse effect on our business, prospects, financial condition and operating results.\n",
    "'''\n",
    "pattern = r'Concentration of Risk: ([^\\n]*)'\n",
    "\n",
    "re.findall(pattern, text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**3. Companies in Europe report their financial numbers on a semi-annual basis, so you can have a document like this. To extract the quarterly and semi-annual periods, you can use a regex as shown below.**\n",
    "\n",
    "Hint: use a non-capturing group (?:...) for the Q/S alternation so that findall returns the whole period (e.g. 2021 Q1) as a single string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['2021 Q1', '2021 S1']"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "text = '''\n",
    "Tesla's gross cost of operating lease vehicles in FY2021 Q1 was $4.85 billion.\n",
    "BMW's gross cost of operating vehicles in FY2021 S1 was $8 billion.\n",
    "'''\n",
    "\n",
    "pattern = r'FY(\\d{4} (?:Q[1-4]|S[1-2]))'\n",
    "matches = re.findall(pattern, text)\n",
    "matches"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
 }
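Review note on exercise 3: the hint about `(?:)` is easier to follow with a side-by-side comparison. The sketch below reuses the sample text from the notebook and runs the same pattern twice, once with a capturing inner group and once with the non-capturing form used in the answer. The variable names `capturing` and `non_capturing` are illustrative only and do not appear in the notebook.

```python
import re

text = '''
Tesla's gross cost of operating lease vehicles in FY2021 Q1 was $4.85 billion.
BMW's gross cost of operating vehicles in FY2021 S1 was $8 billion.
'''

# Inner group is capturing: findall returns one tuple per match,
# with one element for each capturing group.
capturing = re.findall(r'FY(\d{4} (Q[1-4]|S[1-2]))', text)

# Inner group is non-capturing: only the outer group is returned,
# so each match is a plain string.
non_capturing = re.findall(r'FY(\d{4} (?:Q[1-4]|S[1-2]))', text)

print(capturing)      # [('2021 Q1', 'Q1'), ('2021 S1', 'S1')]
print(non_capturing)  # ['2021 Q1', '2021 S1']
```

The raw-string prefix `r'...'` keeps `\d` from being treated as a string escape, which newer Python versions warn about in ordinary string literals.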
diff --git a/Advanced/regex/regex_tutorial_exercise_questions.ipynb b/Advanced/regex/regex_tutorial_exercise_questions.ipynb
 {
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1 align='center'>Python Regular Expression Tutorial Exercise</h1>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**1. Extract all Twitter handles from the following text. A Twitter handle is the text that appears after https://twitter.com/ and is a single word. It contains only alphanumeric characters (A-Z, a-z, 0 to 9) and the underscore _**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "text = '''\n",
    "Follow our leader Elon musk on twitter here: https://twitter.com/elonmusk, more information \n",
    "on Tesla's products can be found at https://www.tesla.com/. Also here are leading influencers \n",
    "for tesla related news,\n",
    "https://twitter.com/teslarati\n",
    "https://twitter.com/dummy_tesla\n",
    "https://twitter.com/dummy_2_tesla\n",
    "'''\n",
    "pattern = '' # todo: type your regex here\n",
    "\n",
    "re.findall(pattern, text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**2. Extract concentration risk types. A risk type is the text that appears after \"Concentration of Risk:\". In the example below, your regex should extract these two strings:**\n",
    "\n",
    "(1) Credit Risk\n",
    "\n",
    "(2) Supply Risk"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = '''\n",
    "Concentration of Risk: Credit Risk\n",
    "Financial instruments that potentially subject us to a concentration of credit risk consist of cash, cash equivalents, marketable securities,\n",
    "restricted cash, accounts receivable, convertible note hedges, and interest rate swaps. Our cash balances are primarily invested in money market funds\n",
    "or on deposit at high credit quality financial institutions in the U.S. These deposits are typically in excess of insured limits. As of September 30, 2021\n",
    "and December 31, 2020, no entity represented 10% or more of our total accounts receivable balance. The risk of concentration for our convertible note\n",
    "hedges and interest rate swaps is mitigated by transacting with several highly-rated multinational banks.\n",
    "Concentration of Risk: Supply Risk\n",
    "We are dependent on our suppliers, including single source suppliers, and the inability of these suppliers to deliver necessary components of our\n",
    "products in a timely manner at prices, quality levels and volumes acceptable to us, or our inability to efficiently manage these components from these\n",
    "suppliers, could have a material adverse effect on our business, prospects, financial condition and operating results.\n",
    "'''\n",
    "pattern = '' # todo: type your regex here\n",
    "\n",
    "re.findall(pattern, text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**3. Companies in Europe report their financial numbers on a semi-annual basis, so you can have a document like this. Write a regex that extracts the quarterly and semi-annual periods (2021 Q1 and 2021 S1 in the text below).**\n",
    "\n",
    "Hint: use a non-capturing group (?:...) for the Q/S alternation so that findall returns the whole period as a single string"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "text = '''\n",
    "Tesla's gross cost of operating lease vehicles in FY2021 Q1 was $4.85 billion.\n",
    "BMW's gross cost of operating vehicles in FY2021 S1 was $8 billion.\n",
    "'''\n",
    "\n",
    "pattern = '' # todo: type your regex here\n",
    "matches = re.findall(pattern, text)\n",
    "matches"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__[Solution](https://github.com/codebasics/py/blob/master/Advanced/regex/regex_tutorial_exercise_answer.ipynb)__"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
 }
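Review note on exercises 1 and 2: a quick way to sanity-check an attempt is to run it against a trimmed copy of the exercise text and compare the result with the strings named in the question. The sketch below is one possible check, not the notebook's own code; `handle_text` and `risk_text` are shortened stand-ins for the notebook samples, and the two patterns are just one way to satisfy the exercise statements.

```python
import re

# Shortened stand-in for the exercise 1 text.
handle_text = (
    "here: https://twitter.com/elonmusk, more at https://www.tesla.com/ "
    "and https://twitter.com/dummy_2_tesla"
)

# The character class stops at the first character that is not a letter,
# digit, or underscore, so the trailing comma is not captured.
handles = re.findall(r'https://twitter\.com/([a-zA-Z0-9_]+)', handle_text)
print(handles)  # ['elonmusk', 'dummy_2_tesla']

# Shortened stand-in for the exercise 2 text (body paragraphs trimmed).
risk_text = '''
Concentration of Risk: Credit Risk
Financial instruments that potentially subject us to a concentration of credit risk
Concentration of Risk: Supply Risk
We are dependent on our suppliers
'''

# [^\n]* stops at the end of the heading line, so only the risk type is captured.
risk_types = re.findall(r'Concentration of Risk: ([^\n]*)', risk_text)
print(risk_types)  # ['Credit Risk', 'Supply Risk']
```

The lowercase phrase "concentration of credit risk" in the body text is not matched, because the pattern is case-sensitive and requires the literal "of Risk: " heading.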